Abstract



The computational problem of distinguishing two quantum channels is central to 
quantum computing. It is a generalization of the well-known satisfiability problem 
from classical to quantum computation. This problem is shown to be surprisingly 
hard: it is complete for the class QIP of problems that have quantum interactive proof 
systems, which implies that it is hard for the class PSPACE of problems solvable by a 
classical computation in polynomial space. 

Several restrictions of distinguishability are also shown to be hard. It is no easier 
when restricted to quantum computations of logarithmic depth, to mixed-unitary 
channels, to degradable channels, or to antidegradable channels. These hardness 
results are demonstrated by finding reductions between these classes of quantum 
channels. These techniques have applications outside the distinguishability problem, 
as the construction for mixed-unitary channels is used to prove that the additivity 
problem for the classical capacity of quantum channels can be equivalently restricted 
to mixed-unitary channels.

Chapter 1
Introduction 



Distinguishing two quantum channels is one of the most important tasks in quantum
information. This is the problem of determining if there is an input state on which the
two channels produce output states that are distinguishable. When this is phrased as
a computational problem it is complete for the complexity class QIP of problems that 
have quantum interactive proof systems. This problem seems to be computationally 
much more difficult than other variants of the problem, such as distinguishing classical 
circuits or distinguishing unitary quantum circuits. 

In light of this hardness, it is natural to consider restricted versions of the problem. 
Many of these special cases are also hard: reductions can be found to some of the more 
interesting classes of quantum channels. These results suggest that this problem is not 
likely to be tractable even on many of the restricted channels that can be realized by 
experiment. This is, however, not a surprise: distinguishing two channels is a restricted 
version of quantum process tomography, which is computationally intractable for large 
systems. 

These reductions provide simulations of general quantum channels by channels 
in restricted classes. While these simulations do not accurately model all aspects of 
the original channel, the constructed simulations do share many properties with the 
original channel. Many of these results can be applied outside the narrow focus of 
distinguishing quantum channels: it is hoped that these techniques will prove useful 
for a number of problems in quantum information theory. 

Contents

1.1 Overview
1.2 Quantum information
1.2.1 Hilbert spaces
1.2.2 Pure states
1.2.3 Linear operators
1.2.4 Mixed states
1.2.5 State evolution and measurement
1.2.6 Channels
1.3 Classes of quantum channels
1.3.1 Circuit restrictions
1.3.2 Degradable and antidegradable channels
1.3.3 Entanglement-breaking channels
1.3.4 Unital channels
1.3.5 Mixed-unitary channels



1.1 Overview 

This thesis studies the computational problem of distinguishing quantum channels. 
This problem asks, given two implementations of quantum channels, is there an in- 
put on which the implemented channels behave distinctly? One of the main results 
of the thesis is that this problem is in general extremely difficult: it is complete for 
the complexity class PSPACE of problems that can be solved with a polynomially 
bounded amount of memory. Since this problem is intractable in general it remains 
to understand those classes of channels for which the problem has an efficient so- 
lution and those classes on which it remains hard. This problem is becoming more 
significant for quantum computing: as larger practical systems are being studied it 
becomes more difficult to verify that the implemented transformation is close to an 
ideal transformation. 

One of the other problems considered in the thesis is the question of the additivity 
of the Holevo capacity for quantum channels. If this quantity were additive, it would 
significantly simplify the tasks of encoding and decoding for the transmission of 
classical information through a quantum channel. Specifically, this question asks 
whether two uses of a channel can send more than twice the classical information that 
can be sent with only one use of the channel. That this might be possible is because 
entanglement may be present between the two inputs to the channel. This question 
stood open for several years until it was recently shown that there exist channels for
which the Holevo capacity is super-additive [Has09]. The classes of channels that are
known to have the additivity property are generally quite restricted. It is important
to better understand which classes of channels are additive, as the use of quantum
channels for sending classical information is an important application of quantum
information.

Figure 1.1: The optimal strategy for determining which channel $\Phi_1$, $\Phi_2$ is the unknown
channel $\Phi_i$, $i = 1, 2$. The strategy is to prepare some state $\rho$ on which the two channels output
maximally distinguishable states, send this state through the channel $\Phi_i$, and then
make an optimal measurement of the result.

The problem of distinguishing channels can be equivalently rephrased as: given
a single use of an unknown channel that is one of two known channels $\Phi_1$ and $\Phi_2$,
what is the probability that the optimal strategy can detect which of the two channels
it is? In general the best strategy in this case is to prepare an input state $\rho$ on which
the output states of $\Phi_1$ and $\Phi_2$ are maximally distinguishable, send $\rho$ through the
unknown channel, and then attempt to solve the distinguishability problem for the
output states of the channels. In general this strategy requires the preparation of a
state on some larger system, only part of which is sent through the unknown channel.
This strategy is illustrated in Figure 1.1. One of the main results of the thesis is that this
problem, properly formalized, is complete for the complexity class QIP of problems
that have quantum interactive proof systems. This is a surprising result: the same
problem restricted to deterministic classical circuits is a restatement of the canonical
NP-complete problem satisfiability. These complexity classes are discussed in more
detail in Chapter 2. For the reader unfamiliar with computational complexity theory,
it is important only to know that the class QIP contains the class NP and it is thought
that QIP is much larger than NP.

This hardness result is found in Chapter 5, where it is shown using a Karp reduction
from the problem of determining if two quantum channels can be made to output states 
that are close together, where a Karp reduction is simply an efficient procedure that 
transforms instances of one problem into equivalent instances of another problem. A 
problem that is the target of a Karp reduction is thus shown to be at least as hard as 
the starting problem. 



[Diagram for Figure 1.2: nodes Close Images, Log-depth CI, Log-depth QCD, Degradable QCD, and Mixed Unitary QCD, with edges labelled by chapter.]



Figure 1.2: Reductions presented in the thesis. Problems are reduced to those problems 
above them. CI and QCD are shorthand for Close Images and Quantum Circuit 
Distinguishability. Edges are marked with the chapter the reduction appears in. 

The Close Images problem is easily derived from the definition of the class QIP,
which implies that it is also QIP-complete. This derivation can be found in Chapter 4
and is originally due to Kitaev and Watrous [KW00].



Given that the distinguishability problem is intractable, much of the remainder of
the thesis is a study of several restricted classes of quantum channels, with a focus on
the hardness of this distinguishability problem on them. For many of these classes
Karp reductions are found from the general problem to the problem on the restricted
class. These reductions prove that these restricted versions of the distinguishability
problem are also QIP-complete. The reductions found in the thesis are illustrated in
Figure 1.2.



The problems shown to be QIP-complete using these reductions are also hard for the
more familiar class PSPACE. In fact, it has recently been shown that QIP = PSPACE
[JJUW09], so the two classes coincide. These problems on quantum channels then
provide an interesting characterization of a fundamental classical
complexity class. Despite this equivalence, the class is referred to as QIP throughout
the thesis, as the hardness results presented here all follow from the definition of the
class in terms of quantum interactive proof systems.

The first of these reductions, in Chapter 4, concerns not the distinguishability problem,
but the close images problem. It is shown that this problem can be equivalently
restricted to the channels implemented by circuits of logarithmic depth. These channels
are an important class: they can be implemented in parallel in a logarithmic
amount of time. This makes these channels interesting from a practical perspective, as
a quantum system implementing one of these channels needs to be protected from
decoherence for only a short time.

The second reduction presented in the thesis is the focus of Chapter 5. This is the
reduction from the close images problem to the problem of distinguishing two channels,
where the channels are given as input to the problem in the form of circuits. This
reduction proves the hardness of the distinguishability problem for general quantum
channels. One other important property of this reduction is that it adds only logarithmic
overhead to the depth of the circuits. This implies that even the log-depth circuit
distinguishability problem is QIP-complete, which provides powerful evidence that
this problem, a restricted case of quantum process tomography, is likely to be difficult
in practice even for computations that can be performed in a very limited amount of
time.

Chapter 6 extends the hardness of the distinguishability problem in a different
direction: to the channels known as the degradable channels. These channels can be
thought of as the channels that preserve most of the information about the input, since
there exists a second channel that maps the output of a degradable channel to the state
of the environment. These channels are described in more detail in Section 1.3.2. This result
implies that distinguishing these channels that do not lose very much information to
the environment remains hard.

In the other direction, Chapter 6 also contains a reduction of the distinguishability
problem to the antidegradable channels. These are the channels for which there exists
a second map that takes the environment state to the output state. In particular, this
means that an eavesdropper with sufficient resources can reconstruct the output of the
channel. At an intuitive level, this implies that an antidegradable channel loses more
information to the environment than it preserves in the output state. The fact that
distinguishing these channels is QIP-hard is evidence that even channels that do not
preserve very much information are hard to distinguish.

The final reduction presented in the thesis, in Chapter 7, is a transformation that
approximates a general channel by one that is mixed-unitary. The mixed-unitary
channels are those channels that can be expressed as the convex mixture of unitary
(i.e. noise-free and reversible) channels. The mixed-unitary channels have several
nice properties that make them interesting in quantum information theory. The
approximate simulation of a channel by a mixed-unitary channel performs well only for
measures of quality based on the maximized purity of the output of the channel. This,
however, suffices to reduce the distinguishability problem to the mixed-unitary channels.
This technique is also used to show that the additivity of the Holevo capacity of
a general channel can be approximately restated in terms of a mixed-unitary channel.
The Holevo capacity, which can be used to measure the amount of classical information
that can be sent through a quantum channel, is introduced in detail in Chapter 3.
This technique allows, for instance, the observation that this quantity is additive for
all channels if and only if it is additive for mixed-unitary channels (Hastings [Has09]
has recently shown that additivity fails, through the construction of a mixed-unitary
channel that is not additive).

Taken together, these reductions demonstrate the hardness of the distinguishability 
problem on several distinct classes of channels. As this problem is one of the most 
interesting problems in quantum computation, these hardness results point to cases
of the problem that cannot be solved efficiently under the usual complexity-theoretic
assumptions. It is also hoped that these complete problems for QIP will
provide a way to further understand this class. The problems shown in Figure 1.2 are 
among only a few problems in quantum information that are known to be complete 
for QIP. 

The technique of reducing a problem to a restricted class by simulating general 
channels by those of a restricted class can also have applications outside of quantum 
computational complexity. For instance, the reduction to mixed-unitary channels in 
Chapter 7 was initially constructed for the distinguishability problem, but the same
construction has implications for the additivity of certain capacities. These techniques 
are powerful and general: any problem defined on quantum channels is a candidate for 
reduction to these restricted classes. This does not work in general, as these reductions 
produce channels that do not simulate the general channel in every sense, but for 
any problem defined using similar notions of distance on quantum channels, these 
reductions apply. These techniques provide not only a method for the study of the 
distinguishability problem, the primary application studied in this thesis, but a tool 
for the more general study of quantum channels and their properties. 



1.2 Quantum information 

In this section the necessary mathematical framework for the problems outlined in 
the previous section is introduced. The concepts and notation used here are relatively 
standard. This is not a complete introduction to quantum information. 

More background on quantum information can be found in the books [BZ06, NC00].
Background on much of the linear algebra introduced here, including a thorough
discussion of the tensor product, can be found in [Rom05]. A good general reference
for results from functional analysis that are occasionally useful in quantum information
is [Con90], while the books [Bha97, HJ85, HJ91] provide a more focused treatment of
the types of operators often found in quantum information.



1.2.1 Hilbert spaces 

The fundamental backdrop for quantum information is the complex Hilbert space.
These spaces are the complete vector spaces over $\mathbb{C}$ with inner product, where the
completeness of the space is with respect to the topology induced by the inner product.
All such spaces considered in this thesis are finite dimensional and denoted by
calligraphic letters $\mathcal{H}, \mathcal{K}, \ldots$. Elements of a Hilbert space $\mathcal{H}$ of dimension $d$
can be represented as vectors in $\mathbb{C}^d$. These vectors are denoted $|\phi\rangle$. Elements of the
dual space $\mathcal{H}^*$, which are (complex) linear functionals on the space $\mathcal{H}$, are denoted
$\langle\phi| = (|\phi\rangle)^*$. The inner product on these Hilbert spaces is defined, for two vectors $|\phi\rangle$
with elements $u_i$ and $|\psi\rangle$ with elements $v_i$, by

$$(|\phi\rangle, |\psi\rangle) = \langle\phi|\psi\rangle = \sum_i \overline{u_i}\, v_i,$$

where $\overline{u}$ denotes the complex conjugate of $u$. This inner product is linear in the second
argument and conjugate linear in the first: this is the usual convention in physics, but
it is opposite to the way things are typically defined in mathematics. The dimension
of a space has been mentioned several times: this is simply the maximum number of
elements in a set of pairwise orthogonal nonzero vectors. When the elements of a
$d$-dimensional space are viewed as vectors with complex entries, the dimension of the
space coincides with the length of these vectors.

Norms are a fundamental tool in quantum information. They provide a means to 
define a notion of size on quantum states and quantum channels. Throughout the 
thesis, we will be most interested in using norms to bound the distance between two 
objects. 

Definition 1.1. A norm $|||\cdot|||$ on some linear space $V$ (over the field $\mathbb{C}$) is a function from
$V$ to $\mathbb{R}$ satisfying three basic properties, for all $x, y \in V$:

Nonnegativity: $|||x||| \ge 0$ with equality if and only if $x = 0$ (1.1)
Homogeneity: $|||cx||| = |c|\,|||x|||$ for all $c \in \mathbb{C}$ (1.2)
Triangle Inequality: $|||x + y||| \le |||x||| + |||y|||$ (1.3)






The standard Euclidean norm on a Hilbert space can be defined in terms of the
inner product given above. This is the norm of a vector $|\phi\rangle$ with elements $v_i$ given by

$$\| |\phi\rangle \| = \sqrt{\langle\phi|\phi\rangle} = \sqrt{\sum_i |v_i|^2}.$$

A vector $|\phi\rangle$ is called normalized if $\| |\phi\rangle \| = 1$.
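As a small illustration of the convention above, the following sketch computes the inner product and the Euclidean norm for vectors stored as plain Python lists. The function names `inner` and `norm` and the sample vectors are assumptions made for this example only; they are not notation from the thesis.

```python
import math

def inner(phi, psi):
    """Inner product <phi|psi>: conjugate linear in phi, linear in psi."""
    return sum(complex(u).conjugate() * v for u, v in zip(phi, psi))

def norm(phi):
    """Euclidean norm sqrt(<phi|phi>)."""
    return math.sqrt(inner(phi, phi).real)

phi = [1j, 0]
psi = [1, 1]
c = 2 + 3j

# Scaling the first argument pulls out conj(c); scaling the second pulls out c.
scaled_first = inner([c * u for u in phi], psi)
scaled_second = inner(phi, [c * v for v in psi])

# Dividing by the norm yields a normalized vector.
psi_normalized = [v / norm(psi) for v in psi]
```

Here `scaled_first` agrees with `c.conjugate() * inner(phi, psi)` while `scaled_second` agrees with `c * inner(phi, psi)`, which is the physics convention described above.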

The standard basis of the Hilbert space $\mathcal{H}$ of dimension $d$ is given by the set of
orthonormal (i.e. normalized and pairwise orthogonal) vectors $\{|0\rangle, |1\rangle, \ldots, |d-1\rangle\}$.
The vector $|i\rangle$ viewed as a vector in $\mathbb{C}^d$ is simply the vector with a one in position $i + 1$
and zeroes in all other positions. This basis is also known as the computational basis.
When no confusion will arise, this basis will also be labelled $\{|1\rangle, |2\rangle, \ldots, |d\rangle\}$.

Two finite dimensional Hilbert spaces $\mathcal{H}, \mathcal{K}$ are isomorphic if they are both of the
same dimension. This is written $\mathcal{H} \cong \mathcal{K}$. In such a case, the canonical isomorphism
between the two spaces simply maps the computational basis of $\mathcal{H}$ to the computational
basis of $\mathcal{K}$. When two spaces are isomorphic, the isomorphism between them will often
be used implicitly to consider vectors in one Hilbert space as being vectors in the other
space.

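Quantum systems of large dimension are often built up of many smaller dimensional
systems. If $\mathcal{H}$ and $\mathcal{K}$ are Hilbert spaces of dimension $\dim\mathcal{H} = d_{\mathcal{H}}$ and
$\dim\mathcal{K} = d_{\mathcal{K}}$, then the Hilbert space of dimension $d_{\mathcal{H}} d_{\mathcal{K}}$ formed by combining them is
denoted $\mathcal{H} \otimes \mathcal{K}$. Similarly, the element $|\phi\rangle \otimes |\psi\rangle \in \mathcal{H} \otimes \mathcal{K}$ is formed by combining the
two elements $|\phi\rangle \in \mathcal{H}$ and $|\psi\rangle \in \mathcal{K}$. When viewed as complex vectors, these elements
are given by the Kronecker product

$$|\phi\rangle \otimes |\psi\rangle = \begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_{d_{\mathcal{H}}} \end{pmatrix} \otimes \begin{pmatrix} v_1 \\ v_2 \\ \vdots \\ v_{d_{\mathcal{K}}} \end{pmatrix} = \begin{pmatrix} u_1 |\psi\rangle \\ u_2 |\psi\rangle \\ \vdots \\ u_{d_{\mathcal{H}}} |\psi\rangle \end{pmatrix}.$$

The notation $|\phi\rangle \otimes |\psi\rangle$ will often be abbreviated $|\phi\rangle|\psi\rangle$ or even $|\phi\psi\rangle$ where no confusion
is likely to arise.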

The space $\mathcal{H} \otimes \mathcal{K}$ does not consist solely of elements of the form $|\phi\rangle|\psi\rangle$: it also
contains linear combinations of these elements. As an example, an element of the
tensor product of two systems of dimension two is $|00\rangle + |11\rangle$. This element cannot
be written in tensor product form. A basis for $\mathcal{H} \otimes \mathcal{K}$ can be formed by taking the
pairwise tensor product of basis elements for the two subsystems, i.e. the set
$\{|i\rangle|j\rangle : 0 \le i < d_{\mathcal{H}},\ 0 \le j < d_{\mathcal{K}}\}$ is a basis for $\mathcal{H} \otimes \mathcal{K}$. When convenient we will also use the
standard basis $\{|i\rangle : 0 \le i < d_{\mathcal{H}} d_{\mathcal{K}}\}$ for this space.
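The Kronecker product and the claim that $|00\rangle + |11\rangle$ is not of product form can be checked numerically. The sketch below is illustrative only: the helper names `kron` and `is_product_2x2` are assumptions, and the product test is specific to two systems of dimension two.

```python
def kron(u, v):
    """Kronecker product of two amplitude vectors given as lists."""
    return [ui * vj for ui in u for vj in v]

def is_product_2x2(c, tol=1e-12):
    """Check whether c = [c00, c01, c10, c11] is a tensor product u (x) v.

    A product vector has entries c_ij = u_i * v_j, so the 2x2 matrix of
    coefficients has rank at most one, i.e. c00 * c11 - c01 * c10 = 0.
    """
    return abs(c[0] * c[3] - c[1] * c[2]) < tol

zero, one = [1, 0], [0, 1]

ket01 = kron(zero, one)  # the basis state |0>|1>
bell = [a + b for a, b in zip(kron(zero, zero), kron(one, one))]  # |00> + |11>

ket01_is_product = is_product_2x2(ket01)
bell_is_product = is_product_2x2(bell)
```

With these definitions `ket01` is the vector `[0, 1, 0, 0]` and passes the product test, while `bell` is `[1, 0, 0, 1]` and fails it.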
1.2.2 Pure states

The state of a quantum system is described, up to a phase $e^{i\theta}$, by a normalized vector
$|\phi\rangle \in \mathcal{H}$, known as a pure state. Provided that there is no uncertainty about the system,
these pure states suffice to completely describe the state of a quantum system. For this
reason they are of fundamental importance in quantum information. Any such state
on a $d$-dimensional Hilbert space can be expressed as

$$|\phi\rangle = \sum_{i=0}^{d-1} \alpha_i |i\rangle,$$

where the amplitudes $\alpha_i$ are complex numbers satisfying $\sum_i |\alpha_i|^2 = 1$, which is simply
a restatement of the normalization requirement.

The smallest system of interest in quantum computation is the two dimensional
Hilbert space. Such a system is often referred to as a qubit. On such a system, the two
standard basis states are $|0\rangle$ and $|1\rangle$. A second basis that is often extremely useful is
given by the two orthogonal states

$$|+\rangle = \frac{1}{\sqrt{2}}\left(|0\rangle + |1\rangle\right), \qquad |-\rangle = \frac{1}{\sqrt{2}}\left(|0\rangle - |1\rangle\right).$$

As was previously mentioned, there are elements in a composite Hilbert space
$\mathcal{H} \otimes \mathcal{K}$ that cannot be decomposed into a tensor product of an element of $\mathcal{H}$ and an
element of $\mathcal{K}$. When quantum states have this property they are called entangled. Up
to normalization, we have already met the maximally entangled state. This is the state
$|\phi^+\rangle \in \mathcal{H} \otimes \mathcal{H}$, where $d = \dim\mathcal{H}$, given by

$$|\phi^+\rangle = \frac{1}{\sqrt{d}} \sum_{i=0}^{d-1} |i\rangle|i\rangle.$$

Any state that is not entangled is called separable.

An important representation of pure states of a composite system $\mathcal{H} \otimes \mathcal{K}$ is the
Schmidt decomposition. Any state $|\phi\rangle \in \mathcal{H} \otimes \mathcal{K}$ may be expressed as

$$|\phi\rangle = \sum_{i=1}^{r} \lambda_i |a_i\rangle|b_i\rangle. \qquad (1.4)$$

In this decomposition the sets $\{|a_i\rangle\}$ and $\{|b_i\rangle\}$ form orthonormal sets in $\mathcal{H}$ and $\mathcal{K}$,
respectively, and the coefficients $\lambda_i$ are all positive and real. The number $r$ in
Equation (1.4) satisfies $r \le \min\{\dim\mathcal{H}, \dim\mathcal{K}\}$ and is known as the Schmidt rank of $|\phi\rangle$. The
numbers $\lambda_i$ are known as the Schmidt coefficients. They satisfy $\sum_i \lambda_i^2 = 1$. Notice that a
pure state has Schmidt rank one if and only if it is separable: this is only one example
of the utility of this decomposition in quantum information theory.
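For two qubits the Schmidt coefficients can be computed in closed form, which gives a quick check of the statement about Schmidt rank and separability. The sketch below relies on the standard fact that the Schmidt coefficients are the singular values of the matrix of amplitudes; the function names and the tolerance are assumptions chosen for this illustration.

```python
import math

def schmidt_coefficients(c00, c01, c10, c11):
    """Schmidt coefficients of c00|00> + c01|01> + c10|10> + c11|11>.

    These are the singular values of C = [[c00, c01], [c10, c11]], i.e. the
    square roots of the eigenvalues of C*C, obtained here from the trace and
    determinant of that 2x2 matrix.
    """
    t = abs(c00) ** 2 + abs(c01) ** 2 + abs(c10) ** 2 + abs(c11) ** 2  # tr(C*C)
    det = abs(c00 * c11 - c01 * c10) ** 2                              # det(C*C)
    disc = math.sqrt(max(t * t - 4 * det, 0.0))
    return [math.sqrt((t + disc) / 2), math.sqrt(max((t - disc) / 2, 0.0))]

def schmidt_rank(coefficients, tol=1e-6):
    """Number of Schmidt coefficients that are nonzero up to tol."""
    return sum(1 for s in coefficients if s > tol)

s = 1 / math.sqrt(2)
entangled = schmidt_coefficients(s, 0, 0, s)        # (|00> + |11>)/sqrt(2)
product = schmidt_coefficients(0.5, 0.5, 0.5, 0.5)  # |+>|+>
```

The maximally entangled state has two equal coefficients close to $1/\sqrt{2}$ and Schmidt rank two, while the product state $|+\rangle|+\rangle$ has a single nonzero coefficient and Schmidt rank one.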
1.2.3 Linear operators

In order to introduce how states evolve during a quantum computation, we must
take a detour through some of the different spaces of linear operators that act on a
Hilbert space $\mathcal{H}$. The most general of these is $L(\mathcal{H}, \mathcal{K})$, which is the set of all linear
operators that map elements of $\mathcal{H}$ to elements of $\mathcal{K}$. As we assume that all Hilbert
spaces appearing in the thesis are finite dimensional, linearity implies boundedness
which in turn implies continuity: this space is often referred to as $B(\mathcal{H}, \mathcal{K})$ for this
reason. When the Hilbert space that a linear operator acts on is viewed as the space of
vectors $\mathbb{C}^{\dim\mathcal{H}}$, the set $L(\mathcal{H}, \mathcal{K})$ is exactly the set of $\dim\mathcal{K}$ by $\dim\mathcal{H}$ complex matrices.
The notation $L(\mathcal{H})$ is shorthand for $L(\mathcal{H}, \mathcal{H})$.

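For $A \in L(\mathcal{H}, \mathcal{K})$, the operator $A^* \in L(\mathcal{K}, \mathcal{H})$ is the adjoint of $A$, in the sense that
$A^*$ is the unique operator that, for any $|\phi\rangle \in \mathcal{H}$ and any $|\psi\rangle \in \mathcal{K}$, satisfies

$$(|\psi\rangle, A|\phi\rangle) = \langle\psi|A|\phi\rangle = (A^*|\psi\rangle, |\phi\rangle).$$

When $A$ is represented by a matrix, $A^*$ is the conjugate transpose of $A$. Given such
a representation, the complex conjugate of $A$ is denoted $\overline{A}$, and the transpose of $A$ is
denoted $A^{\mathsf{T}}$.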

There are a few more classes of operators that are extremely important in quantum
information. One of these classes of operators is the class of Hermitian operators. These
are those operators $A \in L(\mathcal{H})$ such that $A = A^*$. An important subclass of the
Hermitian operators is the set of positive, or positive semidefinite, operators. These are
the Hermitian operators $A \in L(\mathcal{H})$ such that for any $|\phi\rangle \in \mathcal{H}$

$$\langle\phi|A|\phi\rangle \ge 0.$$

The positive operators can be equivalently characterized as those operators $A \in L(\mathcal{H})$
that can be expressed as $A = B^*B$ for some $B \in L(\mathcal{H})$. The notation $A \ge B$ is used
to denote that the operator $A - B$ is positive, with the special case $A \ge 0$ used to
state that $A$ is positive. The other important operators are the unitary operators.
These are the invertible operators $U \in L(\mathcal{H})$ with $U^* = U^{-1}$. It follows from this
property that applying a unitary matrix to a pair of elements of $\mathcal{H}$ does not change
their inner product, which further implies that unitaries do not change the norm.
This property implies that the unitary operators are exactly the invertible operators
that preserve the pure states, a property which makes them extremely important
for quantum computing. One other important characterization is that the unitary
operators are exactly those operators that map orthonormal bases to orthonormal
bases. The set of all unitary operators in $L(\mathcal{H})$ is denoted $U(\mathcal{H})$. The notion of
unitarity can be extended to $V \in L(\mathcal{H}, \mathcal{K})$ with $\dim\mathcal{K} \ge \dim\mathcal{H}$ by considering those
$V$ with the property that $V^*V = \mathbb{1}_{\mathcal{H}}$. Such an operator is called an isometry, and the set
of all such operators is denoted $U(\mathcal{H}, \mathcal{K})$. These operators embed the elements of the
space $\mathcal{H}$ into the larger space $\mathcal{K}$.

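One of the most important operators in this space is $\mathbb{1}_{\mathcal{H}}$, which is the identity
operator on $\mathcal{H}$. As a matrix, this operator has ones on the main diagonal and zeroes
in all other positions. When restricted to qubits, the identity is one of the four Pauli
matrices. These four matrices belonging to $L(\mathcal{H})$, where $\dim\mathcal{H} = 2$, are defined by

$$\mathbb{1} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad X = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad Y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad Z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.$$

One other matrix that will be consistently useful is the Hadamard matrix. This is the
unitary operator that converts the basis $\{|0\rangle, |1\rangle\}$ to the basis $\{|+\rangle, |-\rangle\}$ and vice versa.
This operator can be expressed in matrix form as

$$H = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}.$$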

An extremely important function on linear operators is the trace. This is the operation
$\operatorname{tr}\colon L(\mathcal{H}) \to \mathbb{C}$ that, on a matrix representation of an operator $A$, is simply the sum
of the entries on the main diagonal. One of the most important properties of the trace is that it is
cyclic, i.e. for operators $A, B, C$ we have

$$\operatorname{tr}(ABC) = \operatorname{tr}(BCA) = \operatorname{tr}(CAB),$$

whenever the products in the above equation are defined. Note that the trace is not
stable under more general commutation of the arguments, i.e. there are operators
$A, B, C$ such that $\operatorname{tr}(ABC) \ne \operatorname{tr}(CBA)$.
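A concrete triple of matrices makes both statements easy to verify. The particular matrices below are chosen only for this illustration and do not come from the thesis; `matmul` and `trace` are small helper functions written for the example.

```python
def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def trace(a):
    """Sum of the entries on the main diagonal."""
    return sum(a[i][i] for i in range(len(a)))

A = [[0, 1], [0, 0]]
B = [[0, 0], [1, 0]]
C = [[1, 0], [0, -1]]

# The three cyclic orderings give the same trace.
cyclic = [trace(matmul(matmul(A, B), C)),
          trace(matmul(matmul(B, C), A)),
          trace(matmul(matmul(C, A), B))]

# Reversing the order is not a cyclic shift, and the trace changes.
reversed_order = trace(matmul(matmul(C, B), A))
```

For these matrices every entry of `cyclic` equals 1, while `reversed_order` equals -1.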

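The space $L(\mathcal{H}, \mathcal{K})$ equipped with the inner product given by

$$(A, B) = \operatorname{tr}(A^* B)$$

is also a Hilbert space. This implies that $L(\mathcal{H}) \otimes L(\mathcal{K})$ is well defined. In fact, it is the
case that

$$L(\mathcal{H}, \mathcal{K}) \otimes L(\mathcal{A}, \mathcal{B}) = L(\mathcal{H} \otimes \mathcal{A}, \mathcal{K} \otimes \mathcal{B}).$$

When the tensor product is extended to operators it behaves in the same way as it does
on vectors. For $A \in L(\mathcal{H})$ with elements $a_{ij}$ for $1 \le i, j \le d$ and $B \in L(\mathcal{K})$, the operator
$A \otimes B$ has matrix representation given by the block matrix

$$A \otimes B = \begin{pmatrix} a_{11}B & a_{12}B & \cdots & a_{1d}B \\ a_{21}B & a_{22}B & \cdots & a_{2d}B \\ \vdots & \vdots & \ddots & \vdots \\ a_{d1}B & a_{d2}B & \cdots & a_{dd}B \end{pmatrix}.$$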

As $L(\mathcal{H})$ is itself a Hilbert space, we can find bases of operators for it. An important
orthogonal basis for this space is given by the discrete Weyl operators, also known
as the generalized Pauli operators. These operators extend the Pauli operators to the
$d$-dimensional space $\mathcal{H}$, keeping the properties of orthogonality and unitarity, but
losing Hermiticity. As these operators will be essential to several of the arguments
in the thesis, they are introduced in detail. The discrete Weyl operators are based on
generalizations of the $X$ and $Z$ operations, given by

$$X = \sum_{j=1}^{d} |j+1\rangle\langle j|, \qquad Z = \sum_{j=1}^{d} \omega_d^{\,j}\, |j\rangle\langle j|,$$

where $\omega_d$ is a $d$-th primitive root of unity (such as $e^{2\pi i/d}$), and in the definition of $X$
the operator $|d+1\rangle\langle d|$ is taken to be $|1\rangle\langle d|$. The operator $X$ simply advances each state
of the computational basis to the next, and the operator $Z$ applies a different phase to
each basis state. It is clear from the definition that $XX^* = \mathbb{1}_{\mathcal{H}} = ZZ^*$, which implies
that these operators are unitary. It is also clear that $X$ and $Z$ fail to commute:

$$ZX = \omega_d XZ. \qquad (1.5)$$


Using these operators, the discrete Weyl operator with index $(a, b) \in \mathbb{Z}_d \times \mathbb{Z}_d$ is given
by

$$W_{a,b} = X^a Z^b.$$



For two dimensional systems, these operators are, up to phases, exactly the usual Pauli
matrices. These operators are unitary, since they are products of the unitary operators
$X$ and $Z$. Equation (1.5) can be directly extended to these operators to obtain

$$W_{a,b} W_{e,f} = X^a Z^b X^e Z^f = \omega_d^{\,be - af} X^e Z^f X^a Z^b = \omega_d^{\,be - af}\, W_{e,f} W_{a,b}. \qquad (1.6)$$

To see that these operators form an orthogonal basis for $L(\mathcal{H})$, notice that, by the cyclic
property of the trace,

$$\operatorname{tr} W_{a,b}^* W_{e,f} = \operatorname{tr} Z^{-b} X^{-a} X^{e} Z^{f} = \operatorname{tr} X^{e-a} Z^{f-b} = \begin{cases} d & \text{if } a = e \text{ and } b = f, \\ 0 & \text{otherwise.} \end{cases} \qquad (1.7)$$

These operators can be normalized to obtain an orthonormal basis for $L(\mathcal{H})$, but this
comes at the cost of unitarity, so we will not do this here. The discrete Weyl operators
will be used in Chapter 7 to show that several transformations on quantum states can
be realized as convex mixtures of unitary transformations.
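The commutation relation (1.5) and the orthogonality relation (1.7) can be checked numerically for a small dimension. The sketch below is an illustration under stated assumptions: it takes $\omega_d = e^{2\pi i/d}$, stores the basis state $|j\rangle$ (for $j = 1, \ldots, d$) at list index $j - 1$, and uses helper names (`shift_and_clock`, `weyl`, and so on) invented for this example.

```python
import cmath

def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(a):
    """Conjugate transpose of a matrix."""
    return [[a[j][i].conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def power(a, n):
    """n-th power of a square matrix, starting from the identity."""
    d = len(a)
    result = [[1 if i == j else 0 for j in range(d)] for i in range(d)]
    for _ in range(n):
        result = matmul(result, a)
    return result

def shift_and_clock(d):
    """Return omega_d and the generalized X and Z operators in dimension d."""
    omega = cmath.exp(2j * cmath.pi / d)
    # Index c holds |c+1>; X sends it to the next basis state, wrapping around.
    X = [[1 if r == (c + 1) % d else 0 for c in range(d)] for r in range(d)]
    # Z applies the phase omega^j to |j>, which sits at index j - 1.
    Z = [[omega ** (r + 1) if r == c else 0 for c in range(d)] for r in range(d)]
    return omega, X, Z

def weyl(a, b, d):
    """Discrete Weyl operator W_{a,b} = X^a Z^b."""
    _, X, Z = shift_and_clock(d)
    return matmul(power(X, a), power(Z, b))

d = 3
omega, X, Z = shift_and_clock(d)
ZX, XZ = matmul(Z, X), matmul(X, Z)
commutation_ok = all(abs(ZX[r][c] - omega * XZ[r][c]) < 1e-9
                     for r in range(d) for c in range(d))

indices = [(a, b) for a in range(d) for b in range(d)]
gram = {(s, t): trace(matmul(dagger(weyl(*s, d)), weyl(*t, d)))
        for s in indices for t in indices}
orthogonal_ok = all(abs(gram[s, t] - (d if s == t else 0)) < 1e-9
                    for s in indices for t in indices)
```

Both `commutation_ok` and `orthogonal_ok` evaluate to true for $d = 3$; changing `d` repeats the check in other dimensions.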

A linear operator $A \in L(\mathcal{H})$ is normal if $A^*A = AA^*$. By definition the Hermitian
and unitary operators are normal. Any normal operator $A \in L(\mathcal{H})$ has a spectral
decomposition, which is a representation as

$$A = \sum_i \lambda_i |\phi_i\rangle\langle\phi_i|, \qquad (1.8)$$

where the vectors $\{|\phi_i\rangle\}$, called the eigenvectors of $A$, are an orthonormal basis for $\mathcal{H}$.
The associated complex numbers $\lambda_i$ are called the eigenvalues of $A$. The space spanned
by the eigenvectors of $A$ corresponding to nonzero eigenvalues is called the support of
$A$. This space has dimension equal to the rank of $A$.

The classes of operators that we have previously encountered can be characterized
in terms of the spectral decomposition. A normal matrix $U$ is unitary if and only if all
of its eigenvalues have modulus one, i.e. if $|\lambda_i| = 1$ for all $i$. This also implies that $U$ is
full rank. A normal operator is Hermitian if and only if all of its eigenvalues are real.
This can be seen by considering the adjoint of the representation in Equation (1.8). As
a further restriction, an operator is positive if and only if it is normal and has only
nonnegative real eigenvalues. In addition to this, if $A$ is an operator with eigenvalues
$\lambda_i$, then the trace of $A$ is given by $\operatorname{tr} A = \sum_i \lambda_i$. This characterization of the trace is
extremely useful.

The spectral decomposition also allows functions on the complex numbers to be
extended to operators in $L(\mathcal{H})$. An example of this is the square root of a positive matrix.
This is defined, for any positive operator $A$, by taking the square roots of the
eigenvalues, i.e.

$$\sqrt{A} = \sum_i \sqrt{\lambda_i}\, |\phi_i\rangle\langle\phi_i|,$$

where $A$ has spectral decomposition $A = \sum_i \lambda_i |\phi_i\rangle\langle\phi_i|$. It is easy to see from this
definition that the operator square root satisfies $\sqrt{A}\sqrt{A} = A$. It is less obvious that
$\sqrt{A}$ defined in this way is the unique positive square root of the operator $A$, but this is indeed
the case. This technique can be extended to define $\log(A)$ for a positive matrix $A$ as
well as $|A|$ for general normal matrices.
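The definition of a function of an operator through its spectral decomposition can be tried on a small example. The sketch below builds a positive matrix from a hand-picked spectral decomposition (eigenvalues 3 and 1 with eigenvectors $|+\rangle$ and $|-\rangle$, chosen for this illustration only) and forms its square root by applying the function to the eigenvalues. The helper `from_spectrum` is an assumed name, not notation from the thesis.

```python
import math

def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def from_spectrum(pairs, f=lambda x: x):
    """Build sum_i f(lambda_i) |phi_i><phi_i| from (eigenvalue, eigenvector) pairs."""
    d = len(pairs[0][1])
    total = [[0.0] * d for _ in range(d)]
    for lam, vec in pairs:
        for i in range(d):
            for j in range(d):
                # Entry (i, j) of the projector |vec><vec| is vec_i * conj(vec_j).
                total[i][j] += f(lam) * vec[i] * vec[j].conjugate()
    return total

s = 1 / math.sqrt(2)
plus, minus = [s, s], [s, -s]
spectrum = [(3.0, plus), (1.0, minus)]

A = from_spectrum(spectrum)                  # the matrix [[2, 1], [1, 2]]
sqrt_A = from_spectrum(spectrum, math.sqrt)  # square roots taken on eigenvalues
squared = matmul(sqrt_A, sqrt_A)             # should reproduce A
```

Up to floating point error, `squared` agrees with `A`. Passing a different function as `f` (for example `math.log`) applies it to the spectrum in the same way.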

The absolute value of an operator has a different definition when $A \in L(\mathcal{H}, \mathcal{K})$ is
not normal (and potentially not square). In this case, $|A| = \sqrt{A^*A}$, relying on the
fact that for any operator $A$, the operator $A^*A$ is positive. The eigenvalues of $|A|$
play a central role in another important decomposition of an operator. The singular
value decomposition of $A \in L(\mathcal{H}, \mathcal{K})$ gives a representation of $A$ that mimics the spectral
decomposition, but exists even when $A$ is not normal. This representation is

$$A = \sum_{i=1}^{d} s_i |\phi_i\rangle\langle\psi_i|,$$

where $d = \min\{\dim\mathcal{H}, \dim\mathcal{K}\}$. The values $s_i$ are nonnegative and real; these are called
the singular values of $A$. They are equal to the eigenvalues of $|A|$. The vectors $\{|\phi_i\rangle\}$ and
$\{|\psi_i\rangle\}$ form orthonormal sets in $\mathcal{K}$ and $\mathcal{H}$, respectively. The singular values will be very
important in Chapter 3, where we consider a collection of operator norms that depend
solely on them.



1.2.4 Mixed states

Pure states suffice to model the behaviour of a quantum system in a known state, but
they do not completely capture the situation when there is uncertainty about exactly
which state a system is in. As an example, if the state of a two-dimensional system is $|0\rangle$
or $|1\rangle$ each with probability one-half, then the behaviour of the system is identical to
one in which the state is a uniform mixture of $|+\rangle$ or $|-\rangle$, yet these two descriptions
differ.

This problem is resolved by resorting to density operators. Given a system that is 
in the state $|\phi_i\rangle$ with probability $p_i$ (this is called the ensemble $\{(p_i, |\phi_i\rangle)\}$), the density 
operator associated with the system is given by 

\[ \rho = \sum_i p_i\, |\phi_i\rangle\langle\phi_i|. \]

A given density matrix may, in general, have an infinite set of ensembles that generate 
it. This notation resolves the earlier example, since we have, for any two orthonormal 
bases $\{|\phi_i\rangle\}$ and $\{|\psi_i\rangle\}$ of $\mathcal{H}$, 

\[ \frac{1}{\dim \mathcal{H}} \sum_i |\phi_i\rangle\langle\phi_i| = \tilde{\mathbb{1}}_{\mathcal{H}} = \frac{1}{\dim \mathcal{H}} \sum_i |\psi_i\rangle\langle\psi_i|, \]

where the symbol $\tilde{\mathbb{1}}_{\mathcal{H}}$ denotes $\mathbb{1}_{\mathcal{H}} / \dim \mathcal{H}$, the normalized identity operator on $\mathcal{H}$. An 
element $\rho \in L(\mathcal{H})$ is called a density operator (or equivalently a density matrix) if and 
only if it satisfies the two properties 

1. $\rho$ is positive, 

2. $\operatorname{tr} \rho = 1$. 

The set of all such density operators on $\mathcal{H}$ is denoted $D(\mathcal{H})$. 
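
The qubit example above can be verified directly. In the following sketch the function names `density_from_ensemble` and `is_density_operator` are hypothetical helpers introduced only for illustration. It builds the density operators of the two ensembles and checks that they coincide and satisfy the two defining properties.

```python
import numpy as np

def density_from_ensemble(ensemble):
    """rho = sum_i p_i |phi_i><phi_i| for a list of (p_i, |phi_i>) pairs."""
    return sum(p * np.outer(v, v.conj()) for p, v in ensemble)

def is_density_operator(rho, tol=1e-9):
    """Check positivity (Hermitian with nonnegative eigenvalues) and unit trace."""
    hermitian = np.allclose(rho, rho.conj().T, atol=tol)
    positive = np.all(np.linalg.eigvalsh(rho) >= -tol)
    unit_trace = abs(np.trace(rho) - 1.0) <= tol
    return bool(hermitian and positive and unit_trace)

ket0 = np.array([1.0, 0.0])
ket1 = np.array([0.0, 1.0])
plus = (ket0 + ket1) / np.sqrt(2)
minus = (ket0 - ket1) / np.sqrt(2)

rho_computational = density_from_ensemble([(0.5, ket0), (0.5, ket1)])
rho_plus_minus = density_from_ensemble([(0.5, plus), (0.5, minus)])

print(np.allclose(rho_computational, rho_plus_minus))
print(is_density_operator(rho_computational))
```

Both ensembles produce the normalized identity on a two-dimensional space, which is why no measurement can tell the two preparations apart.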

The density operators are also referred to as the mixed states, as they provide a 
complete description of a quantum system. Pure states also fit into this framework: 
the state $|\phi\rangle$ corresponds to the density operator $|\phi\rangle\langle\phi|$, and these two notions of state 
will be used interchangeably in this case. Notice also that the set of density matrices 
$D(\mathcal{H})$ is both compact and convex. The extreme points of this set are simply the rank 
one projectors $|\phi\rangle\langle\phi|$ corresponding to pure states in $\mathcal{H}$. 

A mixed state in $D(\mathcal{H} \otimes \mathcal{K})$ is called separable if it is the convex combination of a 
set of separable pure states in $\mathcal{H} \otimes \mathcal{K}$. If a mixed state cannot be decomposed in this way, 
i.e. any ensemble contains an entangled pure state, then it is called entangled. 

1.2.5 State evolution and measurement 

The evolution of a quantum system in the state $\rho \in D(\mathcal{H})$ is determined by the action 
of a unitary operator $U \in U(\mathcal{H})$. The state of the system after this evolution is $U \rho U^*$. 
As we shall see in the next section, this does not capture every quantum process, but it 
is an important special case. As $U^{-1} = U^*$ is also unitary, this implies that any unitary 
evolution is, in principle, reversible. 

Measurements provide a method for retrieving information from a quantum system. 
The simplest case of measurement is given by a projective measurement, which is 
a set $\{\Pi_i\}$ of orthogonal projectors in $L(\mathcal{H})$ with the property that $\sum_i \Pi_i = \mathbb{1}_{\mathcal{H}}$. When 
this measurement is performed on a state $\rho \in D(\mathcal{H})$ the outcome is $i$ with probability 
$p_i = \operatorname{tr}(\Pi_i \rho)$ and the state after measurement is $(\Pi_i \rho \Pi_i)/p_i$. In the case that the 
outcome of the measurement is unknown, i.e. it is discarded or forgotten, the resulting 
state is given by $\sum_i \Pi_i \rho \Pi_i$, i.e. the weighted mixture of all of the measurement 
outcomes. 
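
A short sketch of these measurement rules follows. The function name `projective_measurement` and the choice of input state are assumptions made for illustration only.

```python
import numpy as np

def projective_measurement(rho, projectors):
    """Return outcome probabilities, post-measurement states, and the averaged state."""
    probabilities, post_states = [], []
    averaged = np.zeros_like(rho, dtype=complex)
    for proj in projectors:
        unnormalized = proj @ rho @ proj
        p = np.trace(unnormalized).real          # equals tr(Pi_i rho) since Pi_i^2 = Pi_i
        probabilities.append(p)
        # The post-measurement state is only defined for outcomes that can occur.
        post_states.append(unnormalized / p if p > 1e-12 else None)
        averaged += unnormalized                 # sum_i Pi_i rho Pi_i
    return probabilities, post_states, averaged

plus = np.array([1.0, 1.0]) / np.sqrt(2)
rho = np.outer(plus, plus.conj())                # the pure state |+><+|
projectors = [np.diag([1.0, 0.0]), np.diag([0.0, 1.0])]

probs, posts, averaged = projective_measurement(rho, projectors)
print(probs)
```

Forgetting the outcome of this measurement on $|+\rangle$ leaves the completely mixed state, even though each individual outcome leaves a pure state.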

There is a special case of projective measurement that is of particular importance in 
quantum computing. This is known as measurement in the computational basis, which 
is given by the complete set of projectors $\{|i\rangle\langle i| : 0 \leq i < \dim \mathcal{H}\}$. Any measurement 
using orthogonal rank one projectors can be derived from this measurement by rotating 
the state $\rho$ to be measured using some unitary operation $U$. 

Projective measurements are not the only case allowed by quantum mechanics. 
More generally, a POVM measurement is given by a set $\{E_i\}$ of positive operators 
that sum to $\mathbb{1}_{\mathcal{H}}$. The outcome of such a measurement is $i$ with probability $p_i = 
\operatorname{tr}(E_i \rho)$. While the state after measurement can be defined as in the case of projective 
measurements, a simpler model will suffice for the results in this thesis. In this 
model the outcome after measurement is the state $|i\rangle$ when the result is $i$. This form of 
measurement can be quite convenient to work with. POVM measurements can always 
be realized by projective measurements on the state $\rho \otimes |0\rangle\langle 0|$ in a larger Hilbert space. 
This result is known as Naimark's theorem [Neu43]. 



1.2.6 Channels 

We have already seen two types of evolution for quantum states: unitary evolution and 
measurement. Both of these types of evolution are special cases of the most general 
type of transformation on quantum states. These are the linear transformations that 
map density matrices to density matrices, known as quantum channels. Such a map 
can capture any process allowed by quantum mechanics. These maps can also be 
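characterized as the linear operators $\Phi$ from $L(\mathcal{H})$ to $L(\mathcal{K})$ that satisfy two properties 

Trace preserving: $\operatorname{tr} \Phi(X) = \operatorname{tr} X$ for all $X \in L(\mathcal{H})$ 
Complete positivity: If $X \geq 0$ then $(\Phi \otimes I_{\mathcal{F}})(X) \geq 0$ for all spaces $\mathcal{F}$ and all $X \in L(\mathcal{H} \otimes \mathcal{F})$. 

In the above definition, $I_{\mathcal{F}}$ is the identity transformation on $L(\mathcal{F})$ and the map $\Phi \otimes \Psi$ 
is simply the map that applies $\Phi$ and $\Psi$ on their respective Hilbert spaces. The set of 
all channels from $L(\mathcal{H})$ to $L(\mathcal{K})$ is denoted $T(\mathcal{H}, \mathcal{K})$. A linear map taking $L(\mathcal{H})$ to $L(\mathcal{K})$ 
that is not necessarily a channel will occasionally be referred to as a super-operator. 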

One of the most important quantum channels is the operation known as the partial 
trace. This is the channel in $T(\mathcal{H} \otimes \mathcal{K}, \mathcal{H})$ that traces out the system in the space $\mathcal{K}$. 
This map is defined on $X \otimes Y$ with $X \in L(\mathcal{H})$ and $Y \in L(\mathcal{K})$ by 

\[ \operatorname{tr}_{\mathcal{K}} (X \otimes Y) = (\operatorname{tr} Y) X, \]

and extended to the whole space $L(\mathcal{H} \otimes \mathcal{K})$ by linearity. This is the operation that 
discards the system in $\mathcal{K}$, and as such, is very useful in quantum information. The 
partial trace can also be expressed by explicitly writing out the trace over $\mathcal{K}$. Let $\{|\phi_i\rangle\}$ 
be any orthonormal basis for $\mathcal{K}$, then for any $X \in L(\mathcal{H} \otimes \mathcal{K})$, the partial trace over $\mathcal{K}$ is 

\[ \operatorname{tr}_{\mathcal{K}} X = \sum_i (\mathbb{1}_{\mathcal{H}} \otimes \langle\phi_i|)\, X\, (\mathbb{1}_{\mathcal{H}} \otimes |\phi_i\rangle). \]
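
The explicit formula for the partial trace translates almost literally into code. The sketch below assumes the standard basis for the second factor and uses the hypothetical name `partial_trace_second`. It checks the defining property on a product operator and compares against an equivalent reshape-based computation.

```python
import numpy as np

def partial_trace_second(x, dim_h, dim_k):
    """Trace out the second factor of an operator on H (x) K using a basis of K."""
    total = np.zeros((dim_h, dim_h), dtype=complex)
    identity_h = np.eye(dim_h)
    for i in range(dim_k):
        basis_ket = np.zeros((dim_k, 1))
        basis_ket[i, 0] = 1.0
        # 1_H (x) |i> maps H into H (x) K; its adjoint is 1_H (x) <i|.
        embed = np.kron(identity_h, basis_ket)
        total += embed.conj().T @ x @ embed
    return total

rng = np.random.default_rng(0)
x = rng.normal(size=(2, 2))
y = rng.normal(size=(3, 3))
product = np.kron(x, y)

reduced = partial_trace_second(product, 2, 3)
# Equivalent computation: view the operator as a 4-index tensor and contract.
reduced_by_reshape = np.trace(product.reshape(2, 3, 2, 3), axis1=1, axis2=3)

print(np.allclose(reduced, np.trace(y) * x))
```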

One important feature of mixed states is that they can always be viewed as part of 
a pure state on a larger Hilbert space. Any $\rho \in D(\mathcal{H})$ can be expressed using a pure state 
$|\phi\rangle \in \mathcal{H} \otimes \mathcal{K}$, where $\dim \mathcal{K} \geq \operatorname{rank} \rho$, as 

\[ \rho = \operatorname{tr}_{\mathcal{K}} |\phi\rangle\langle\phi|. \]

The state $|\phi\rangle$ is referred to as a purification of $\rho$. These purifications will form an 
integral part of many of the proof techniques used in this thesis. It is also important 
that any two purifications $|\phi\rangle, |\psi\rangle \in \mathcal{H} \otimes \mathcal{K}$ of a state $\rho \in D(\mathcal{H})$ are related by a unitary 
operation on the space $\mathcal{K}$ alone, i.e. there exists a $U \in U(\mathcal{K})$ such that 

\[ (\mathbb{1}_{\mathcal{H}} \otimes U)|\phi\rangle = |\psi\rangle. \]

This fact will be used in the definition of the fidelity in Chapter 3. 
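
One standard way to construct a purification is from the spectral decomposition of $\rho$. The sketch below takes $\dim \mathcal{K} = \dim \mathcal{H}$ for simplicity, which is an assumption made here, and the helper names are hypothetical. It also checks that applying a unitary to the purifying space alone yields another purification of the same state.

```python
import numpy as np

def purify(rho):
    """Return |phi> = sum_i sqrt(lambda_i) |v_i> (x) |i> in H (x) K with dim K = dim H."""
    dim = rho.shape[0]
    eigenvalues, eigenvectors = np.linalg.eigh(rho)
    eigenvalues = np.clip(eigenvalues, 0.0, None)
    phi = np.zeros(dim * dim, dtype=complex)
    for i in range(dim):
        basis = np.zeros(dim)
        basis[i] = 1.0
        phi += np.sqrt(eigenvalues[i]) * np.kron(eigenvectors[:, i], basis)
    return phi

def trace_out_second(x, dim_h, dim_k):
    """Partial trace over the second tensor factor."""
    return np.trace(x.reshape(dim_h, dim_k, dim_h, dim_k), axis1=1, axis2=3)

rho = np.array([[0.75, 0.25], [0.25, 0.25]])     # an example density matrix
phi = purify(rho)
recovered = trace_out_second(np.outer(phi, phi.conj()), 2, 2)

# A unitary acting only on K gives a different purification of the same rho.
hadamard = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)
psi = np.kron(np.eye(2), hadamard) @ phi
recovered_from_psi = trace_out_second(np.outer(psi, psi.conj()), 2, 2)

print(np.allclose(recovered, rho), np.allclose(recovered_from_psi, rho))
```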

There are two convenient representations of quantum channels that will be needed. 
The first of these is the representation of a completely positive map $\Phi$ by a set of Kraus 
operators, which are matrices $A_i$ such that 

\[ \Phi(X) = \sum_i A_i X A_i^*. \]

This representation is due to Choi [Cho75]. If, in addition, $\Phi$ is trace preserving, then 
the operators $A_i$ satisfy the property 

\[ \sum_i A_i^* A_i = \mathbb{1}. \]

If $\Phi \in T(\mathcal{H}, \mathcal{K})$ then the number of Kraus operators in a minimal Kraus decomposition 
is at most $(\dim \mathcal{H})(\dim \mathcal{K})$. 
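
To make the Kraus representation concrete, the following sketch applies a channel given by Kraus operators and checks the trace-preservation condition. The amplitude damping channel and the value of `gamma` are illustrative choices that do not appear in the thesis.

```python
import numpy as np

def apply_kraus(kraus_ops, x):
    """Phi(X) = sum_i A_i X A_i^*."""
    return sum(a @ x @ a.conj().T for a in kraus_ops)

def is_trace_preserving(kraus_ops, tol=1e-9):
    """Check that sum_i A_i^* A_i is the identity on the input space."""
    dim_in = kraus_ops[0].shape[1]
    total = sum(a.conj().T @ a for a in kraus_ops)
    return bool(np.allclose(total, np.eye(dim_in), atol=tol))

gamma = 0.3   # hypothetical damping probability
amplitude_damping = [
    np.array([[1.0, 0.0], [0.0, np.sqrt(1 - gamma)]]),
    np.array([[0.0, np.sqrt(gamma)], [0.0, 0.0]]),
]

rho = np.array([[0.0, 0.0], [0.0, 1.0]])         # the state |1><1|
out = apply_kraus(amplitude_damping, rho)

print(is_trace_preserving(amplitude_damping))
print(out)
```

Here two Kraus operators suffice, well within the bound of $(\dim \mathcal{H})(\dim \mathcal{K}) = 4$ for a qubit channel.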

The second representation of importance is known as the Stinespring Dilation 
Theorem, after the 1955 work of Stinespring [Sti55], though the precise statement of 
the result we use here is given by Hellwig and Kraus [HK70]. This theorem states 
that any quantum channel can be represented as a unitary operation on a larger space, 
some of which is traced out. More formally, for a channel $\Phi \in T(\mathcal{H}, \mathcal{K})$ there are spaces 
$\mathcal{A}$, $\mathcal{B}$ and a $U \in U(\mathcal{H} \otimes \mathcal{A}, \mathcal{K} \otimes \mathcal{B}) \cong U(\mathcal{H} \otimes \mathcal{A})$ such that 

\[ \Phi(X) = \operatorname{tr}_{\mathcal{B}} U (X \otimes |0\rangle\langle 0|) U^*, \]

where $\mathcal{A}$ can be chosen so that $\dim \mathcal{A} \leq \dim \mathcal{H} \dim \mathcal{K}$. Such a representation is unique 
up to an isometry on the space that is traced out. This representation can be used to 
recover a Kraus representation: see [Sch96] for an overview of this result. 
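
The passage from a Stinespring representation to Kraus operators can be sketched as follows, taking $A_i = (\mathbb{1} \otimes \langle i|)\, U\, (\mathbb{1} \otimes |0\rangle)$ with one operator for each basis state of the traced-out space. The use of a CNOT gate as the unitary $U$, with a one-qubit ancilla and a one-qubit environment, is an assumption chosen only to keep the example small; the function names are hypothetical.

```python
import numpy as np

def stinespring_apply(u, x, dim_out, dim_env):
    """Phi(X) = tr_B U (X (x) |0><0|) U^*, with the environment B as second factor."""
    dim_in = x.shape[0]
    dim_anc = u.shape[1] // dim_in
    anc = np.zeros((dim_anc, dim_anc), dtype=complex)
    anc[0, 0] = 1.0                                   # ancilla state |0><0|
    evolved = u @ np.kron(x, anc) @ u.conj().T
    return np.trace(evolved.reshape(dim_out, dim_env, dim_out, dim_env),
                    axis1=1, axis2=3)

def kraus_from_stinespring(u, dim_in, dim_out, dim_env):
    """A_i = (1 (x) <i|) U (1 (x) |0>), one operator per environment basis state."""
    dim_anc = u.shape[1] // dim_in
    anc_ket = np.zeros((dim_anc, 1), dtype=complex)
    anc_ket[0, 0] = 1.0
    embed = np.kron(np.eye(dim_in), anc_ket)
    kraus = []
    for i in range(dim_env):
        bra = np.zeros((1, dim_env), dtype=complex)
        bra[0, i] = 1.0
        kraus.append(np.kron(np.eye(dim_out), bra) @ u @ embed)
    return kraus

cnot = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=complex)

plus = np.array([1.0, 1.0]) / np.sqrt(2)
rho = np.outer(plus, plus.conj())

via_dilation = stinespring_apply(cnot, rho, 2, 2)
kraus = kraus_from_stinespring(cnot, 2, 2, 2)
via_kraus = sum(a @ rho @ a.conj().T for a in kraus)

print(np.allclose(via_dilation, via_kraus))
```

With this choice of $U$ the resulting channel removes the off-diagonal entries of its input, and the recovered Kraus operators are the two computational basis projectors.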

The Stinespring representation implies that in order to model a quantum channel, 
we need worry about only three parts: introducing ancillary qubits in a known pure 
state, implementing unitary operations, and implementing the partial trace. This will 
be extremely helpful for the reductions in this thesis that seek to simulate general 
quantum channels with channels from restricted classes. More details can be found in 
Section 2.1 where the model of circuits used in the thesis is formally defined. 



1.3 Classes of quantum channels 

This section provides an overview of the different classes of quantum channels that 
will be encountered in this thesis. This overview will be kept somewhat brief, as the 
classes that will receive detailed treatment are reintroduced more thoroughly in the 
chapters where they appear. 

The classes of channels presented here place different restrictions on the set of 
channels. Some of these restrictions come from practical notions, like the channels that 
can be implemented in a small amount of time, and some of the restrictions come from 
more theoretical concerns, such as the antidegradable channels. The restricted classes 
studied here are largely incomparable: this is because one of the aims of this thesis is 
to present simplified versions of the distinguishability problem that are nevertheless 
just as hard as the general case. In order that these results cover more of the quantum 
channels that are likely to arise in practice, it is helpful if the distinguishability problem 
is shown to be hard on several unrelated classes of channels. 

The material that appears in this section makes several references to the material 
that follows in Chapter 3. Because this material is only used in a superficial way in 
this section, it is not necessary to have read Chapter 3 first, though a familiarity with 
quantum information will help. 



1.3.1 Circuit restrictions 

Quantum circuits are a convenient way to provide a quantum channel as input to 
a computational problem, such as the problem of distinguishing quantum channels. 
One of the advantages of this representation is that, given a quantum computer, it 
allows the channel to be evaluated, but it remains computationally infeasible to find a 
matrix representation for all but the simplest channels. The circuit model used here is 
presented in detail in Section 2.1. This level of detail will not be necessary to introduce 
the classes of channels defined by placing restrictions on this model. 

The circuit representation for quantum channels allows restricted classes of chan- 
nels to be defined by placing restrictions on the types of circuits that are allowed. 
These channels can be much simpler than general channels. An example of this is the 
class of channels with stabilizer circuits, which are the circuits defined on a restricted 
set of quantum gates. Given such a circuit, a channel can be efficiently simulated using 
a deterministic classical computer [AG04], but it is expected that this is not possible 
for quantum channels given as general quantum circuits, as this would imply the 
equivalence of classical and quantum computation. 

Restricting the input circuits to the distinguishability problem mentioned in Sec- 
tion 1.1 can lead to simpler variants of the problem. One such restriction is to the class 
of channels that implement unitary operations. These channels can be obtained as a 
circuit restriction by eliminating the non-unitary gates from the circuit model. Distinguishing 
these circuits appears to be easier than distinguishing general mixed-state circuits [UWB05]. 



One of the more interesting circuit restrictions is the requirement that the input 
circuits to the distinguishability problem have depth logarithmic in the number of 
input qubits. These are the circuits that can be performed in logarithmic time with 
a parallel model of quantum computing. Such a model is not unreasonable in many 
implementation schemes for quantum computing. These circuits are interesting as 
they limit the length of time that quantum information needs to be protected from 
decoherence. For this reason, much of experimental quantum computing is concerned 
with very short computations, and log depth circuits are an interesting generalization 
of such computations. Many important quantum algorithms are known to have log 
depth circuits, such as the approximate quantum Fourier transform [CW00] and the 
encoding and decoding operations for many quantum error correcting codes [MN02]. 



One of the results of the thesis is that distinguishing log depth quantum mixed-state 
circuits is complete for QIP, i.e. just as hard as the general case. This is shown 
by reducing the close images problem, studied in Chapter 4, to a log depth version of 
itself. The essential idea behind this reduction is to simulate a general quantum circuit 
by a log-depth one by slicing the circuit into log depth pieces that are performed in 
parallel. This circuit will perform the same computation as the original circuit only if 
the input to one piece matches the output of the previous piece. To ensure that this is 
the case for the circuits constructed in the reduction, tests are applied to force this to be 
the case for any outputs of the two circuits that can potentially have close images. This 
reduction shows that the close images problem remains QIP-complete when restricted 
to log depth circuits. Extending this to the distinguishability problem on log-depth 
circuits follows from the fact that the reduction in Chapter 5 preserves the log-depth 
restriction of the circuits. 



1.3.2 Degradable and antidegradable channels 

The degradable channels are those channels $\Phi$ for which there exists a second channel 
that maps the output of $\Phi$ to the environment of $\Phi$, i.e. the space that is traced 
out in a Stinespring representation. These channels were introduced by Shor and 
Devetak [DS05] and can be thought of as the channels where the environment contains 
no information that is not also present in the output of the channel. A more formal 
definition is: a channel $\Phi \in T(\mathcal{H}, \mathcal{K})$ expressed as $\Phi(\rho) = \operatorname{tr}_{\mathcal{B}} U(\rho \otimes |0\rangle\langle 0|)U^*$ is 
degradable if there exists a channel $D$ such that 

\[ \operatorname{tr}_{\mathcal{K}} U(\rho \otimes |0\rangle\langle 0|)U^* = D(\Phi(\rho)). \]

Stinespring representations are not unique, but any two differ by an isometry on the 
environment space, and this isometry can be absorbed into the channel $D$, which implies 
that the notion of degradability does not depend on the choice of representation. 

The channel given by tracing out the output space $\mathcal{K}$ and not the environment 
space $\mathcal{B}$ is often referred to as the complementary (or conjugate) channel to $\Phi$, with 
the caveat that it is only defined up to the Stinespring representation chosen for $\Phi$. A 
channel is antidegradable if the complementary channel is degradable. More plainly, a 
channel $\Phi$ is antidegradable if there exists a second channel that maps the environment 
of $\Phi$ to the output of $\Phi$. Antidegradable channels are also well-defined, since as in 
the case of degradability, the choice of Stinespring representation can be absorbed 
into the degrading map. A thorough discussion of the degradable and antidegradable 
channels can be found in [CRS08]. 

These channels are discussed in Chapter 6, where it is shown that the problem of 
computationally distinguishing two channels is made no easier when the channels 
are promised to be degradable or antidegradable. This is done using a construction 
similar to one found in [CRS08] that is used to reduce the additivity of the classical 
capacity to the degradable case. 


1.3.3 Entanglement-breaking channels 



An entanglement-breaking channel $\Phi$ is a channel for which the output $(\Phi \otimes I_{\mathcal{H}})(\rho)$ is 
separable for any input state $\rho$. This class of channels contains many of the commonly 
used channels, such as the completely depolarizing channel and the complete dephasing 
channel. It is helpful to state a few alternate characterizations of the 
entanglement-breaking channels. 



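Proposition 1.2 (Horodecki, Shor, and Ruskai [HSR03]). Let $\Phi \in T(\mathcal{H}, \mathcal{K})$. The following 
are equivalent 

1. $\Phi$ is entanglement-breaking, 

2. $(\Phi \otimes I_{\mathcal{H}})(|\phi\rangle\langle\phi|)$ is separable, for $|\phi\rangle$ a maximally entangled state on $\mathcal{H} \otimes \mathcal{H}$, 

3. $\Phi$ has a Kraus decomposition using only rank one operators, 

4. $\Phi$ can be written as 

\[ \Phi(\rho) = \sum_k \sigma_k \operatorname{tr}(E_k \rho), \]

where the $\sigma_k$ are density matrices and the set $\{E_k\}$ forms a POVM. 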
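
The link between the last two characterizations can be illustrated on a small example. The sketch below assumes a hypothetical qubit channel that measures in the computational basis and prepares $|+\rangle$ or $|-\rangle$ according to the outcome; it evaluates the channel both in the measure-and-prepare form and through the corresponding rank one Kraus operators.

```python
import numpy as np

def measure_and_prepare(povm, states, rho):
    """Phi(rho) = sum_k sigma_k tr(E_k rho)."""
    return sum(np.trace(e @ rho) * sigma for e, sigma in zip(povm, states))

ket0 = np.array([1.0, 0.0])
ket1 = np.array([0.0, 1.0])
plus = (ket0 + ket1) / np.sqrt(2)
minus = (ket0 - ket1) / np.sqrt(2)

povm = [np.outer(ket0, ket0), np.outer(ket1, ket1)]      # E_k = |k><k|
states = [np.outer(plus, plus), np.outer(minus, minus)]  # sigma_k
kraus = [np.outer(plus, ket0), np.outer(minus, ket1)]    # rank one: |s_k><k|

rho = np.array([[0.8, 0.1], [0.1, 0.2]])
out_povm = measure_and_prepare(povm, states, rho)
out_kraus = sum(a @ rho @ a.conj().T for a in kraus)

print(np.allclose(out_povm, out_kraus))
```

In this example the POVM elements and prepared states are all pure, so each Kraus operator is simply the outer product of a prepared state with a measurement vector.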

Another property of these channels is that all entanglement-breaking channels are 
antidegradable [CRS08]. 

The distinguishability problem on quantum circuits, considered in Chapter 5, is 
defined in terms of distinguishability with access to a reference system. This method, 
distinguishing channels by observing their action on part of a larger space, is the most 
general method for distinguishing channels. There are channels known for which 



this reference system is required to obtain an optimal distinguishing strategy HWatOSI . 
This reference system allows for entangled inputs to aid in distinguishing the two 
channels and it appears to be essential to the problem. It might be expected that this 
entanglement cannot help distinguish entanglement-breaking channels as the output 
is always separable, but this is not the case. An example on qubit channels has been 
provided by Sacchi [Sac05a, Sac05b]. When this example is generalized to channels 
on a d-dimensional space, however, the amount that this entanglement assists the 
distinguishability goes to zero quadratically with d. This is in contrast to the large 
difference that this reference system can make in the distinguishability of general 
channels. Examples of entanglement-breaking channels with this property are not known. 
Whether or not this is a roadblock to extending the hardness of the distinguishability 
problem to these channels is an interesting open problem. 

It is simple to show that the problem of distinguishing two channels is QIP-hard for 
entanglement-breaking channels that are exponentially close together using a 
straightforward reduction from the general problem. This can be achieved by simply mixing 
the channels in a given instance of distinguishability with enough of the completely 
depolarizing channel that the resulting channels are entanglement-breaking. That this 
occurs is a consequence of the fact that there exists a ball of separable states around the 
completely mixed state [GB02]. Unfortunately this ball has radius that is exponentially 
small in the log of the dimension (i.e. the number of qubits in the original circuits), 
and so the resulting entanglement-breaking channels are exponentially close together. 
The polarization technique that can be applied in the general case, which is discussed 
in Section [3^ cannot be applied to these circuits as they are too close together, and so 
this reduction can only be used to show the hardness of distinguishing circuits that 
are exponentially close together, which is perhaps not a terribly surprising result. 



1.3.4 Unital channels 

A superoperator $\Phi : L(\mathcal{H}) \to L(\mathcal{K})$ is unital if $\Phi(\mathbb{1}_{\mathcal{H}}) = \mathbb{1}_{\mathcal{K}}$. The unital channels are 
often called doubly stochastic as in addition to being unital they are also trace preserving. 
The trace preserving property of quantum channels requires that any unital channel 
have input and output spaces of the same dimension. The unital channels have the 
interesting property that the entropy of the output of the channel is always at least as 
large as the entropy of the input, as noted by King and Ruskai [KR01]. 

This property makes the unital channels interesting from the perspective of the 
additivity of the Holevo capacity, as channels that do not reduce entropy can be used 
as a natural noise model. Fukuda has shown how to construct a unital channel from 
a general channel, without changing the minimum output entropy or the maximum 
output p-norm [Fuk07]. This implies that for a specific class of channels the question 
of additivity can be restricted to a subclass of the unital channels. 

Mendl and Wolf have recently characterized the unital channels as the quantum 
channels that can be decomposed into the affine combination of a set of unitary 
channels [MW09]. More explicitly, they have shown that a channel $\Phi$ is unital if and only 
if there exist unitaries $U_i$ and $\lambda_i \in \mathbb{R}$ with $\sum_i \lambda_i = 1$ such that 

\[ \Phi(X) = \sum_i \lambda_i U_i X U_i^*. \]

The form of this decomposition is very similar to the next class of channels that we 
consider. 


1.3.5 Mixed-unitary channels 



A quantum channel is mixed-unitary if it can be decomposed into the probabilistic 
application of a set of unitary operations. These channels are often referred to as 
the random unitary channels, but this is avoided here because this name often causes 
confusion with the channels defined by drawing unitary operators from the Haar 
measure. More formally, $\Phi$ is mixed-unitary if there exist unitary operators $U_1, \ldots, U_n$ 
and a probability distribution $p_1, \ldots, p_n$ such that 

\[ \Phi(X) = \sum_{i=1}^{n} p_i U_i X U_i^*. \qquad (1.9) \]



It has been shown by Gregoratti and Werner [GW03] that the mixed-unitary channels 
describe exactly the noise processes that can be corrected using classical information 
obtained by measuring the environment. Audenaert and Scheel have recently provided 
necessary and sufficient conditions for a channel to be mixed-unitary [AS08]. 
Buscemi has also provided an upper bound on the number of unitaries needed for a 
mixed-unitary decomposition [Bus06]. 



The set of mixed-unitary channels is contained in the set of all unital channels; this 
is a simple consequence of Equation (1.9). For channels on qubits these two sets of 
channels coincide, but for larger dimensions this is not the case [Tre86, KM87, LS93]. 
It is known, however, that there exists in the set of unital channels a ball of 
mixed-unitary channels around the completely depolarizing channel [Wat09a], which is the 
channel that maps all input states to the completely mixed state. In the case of qubit 
mixed-unitary channels, both additivity of the Holevo capacity and multiplicativity of 
the maximum output p-norm are known to hold [Kin02]. For general mixed-unitary 
channels, both additivity [Has09] and multiplicativity [HW08] are known to fail. These 
properties are considered in more detail in Chapter 3. 
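
The containment of the mixed-unitary channels in the unital channels is easy to confirm numerically: each term $U_i \mathbb{1} U_i^*$ equals the identity, and the probabilities sum to one. The sketch below uses the four Pauli operators as the unitaries, which is an illustrative choice, and also checks that the uniform mixture of them acts as the completely depolarizing channel on a qubit.

```python
import numpy as np

def mixed_unitary(probabilities, unitaries, x):
    """Phi(X) = sum_i p_i U_i X U_i^*."""
    return sum(p * (u @ x @ u.conj().T) for p, u in zip(probabilities, unitaries))

identity = np.eye(2, dtype=complex)
pauli_x = np.array([[0, 1], [1, 0]], dtype=complex)
pauli_y = np.array([[0, -1j], [1j, 0]], dtype=complex)
pauli_z = np.array([[1, 0], [0, -1]], dtype=complex)
paulis = [identity, pauli_x, pauli_y, pauli_z]

# Any probability distribution gives a unital channel.
biased = [0.7, 0.1, 0.1, 0.1]
image_of_identity = mixed_unitary(biased, paulis, identity)

# The uniform mixture sends every state to the completely mixed state.
rho = np.array([[0.8, 0.1j], [-0.1j, 0.2]])
depolarized = mixed_unitary([0.25] * 4, paulis, rho)

print(np.allclose(image_of_identity, identity))
print(np.allclose(depolarized, identity / 2))
```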

The fact that these properties do not hold in general for mixed-unitary channels 
does not completely eliminate interest in the additivity properties of specific mixed- 
unitary channels. One of the contributions of this thesis is a method to approximate a 
general quantum channel with a mixed-unitary one. This approximation can be made 
arbitrarily good in the minimum output entropy or the maximum output p-norm by 
increasing the dimension of the ancillary space used by the approximation. This can be 
used to show that the additivity and multiplicativity problems for a general channel 
can be reduced to the same problem on a mixed-unitary approximation, where the 
approximation error can be made arbitrarily small. Results on this approximation 
may be easier to prove; mixed-unitary channels have been essential to finding 
counterexamples to both the additivity and multiplicativity conjectures. These results can 
then be applied to the original channel by sending the approximation error to zero. 

The method for approximating a general channel by a mixed-unitary one is discussed 
in Chapter 7. Starting with a channel in Stinespring form $\Phi(X) = \operatorname{tr}_{\mathcal{B}} U(X \otimes 
|0\rangle\langle 0|)U^*$ there are only two operations that are not mixed-unitary: the partial trace on 
the system $\mathcal{B}$ and the introduction of the auxiliary system in the $|0\rangle$ state. The partial 
trace is the easy operation to simulate with a mixed-unitary channel as it may be directly 
replaced by the completely depolarizing channel on the space $\mathcal{B}$. The auxiliary 
system in the $|0\rangle$ state is more difficult to replace. The strategy employed is to add this 
extra system to the input space of the channel and test that the input in this space is 
close to $|0\rangle$. If this auxiliary input is close to $|0\rangle$ the channel proceeds exactly as does the 
original channel. If the auxiliary input is far from $|0\rangle$ then the testing procedure sends 
the input state very close to the maximally mixed state, which results in the output of 
the channel having very high entropy. As we are concerned with approximating the 
minimum output entropy this ensures that any input state achieving the minimum 
is very close to $|0\rangle$ in the auxiliary space. This construction produces a channel with 
similar minimum output entropy and maximum output p-norm to the original channel, 
and so it can be used to reduce problems of additivity and multiplicativity to the 
mixed-unitary case. 

This construction can also be performed on circuits in time polynomial in the 
size of the circuit and so it also has implications for the problem of distinguishing 
quantum circuits. This can be used to show that the problem of distinguishing two 
mixed-unitary circuits is as hard as distinguishing two general circuits, which is a 
QIP-complete problem. This result is found in Chapter 7. 


Chapter 2 
Quantum Computational Complexity 



This chapter lays the complexity theoretic groundwork for the remainder of the thesis. 
This includes a definition of the circuit model that is used throughout the thesis as well 
as a brief overview of some of the complexity classes that will be encountered later. 

The circuit model used here is the mixed-state circuit model of Aharonov et 
al. [AKN98] that allows measurements and other non-unitary operations to take 
place during a computation. This model will be essential to the problems considered 
in the thesis: the distinguishability problem appears to be strictly more difficult on this 
circuit model than it is on the model of unitary circuits. This is despite the fact that 
both of these models are computationally equivalent, in the sense that any problem 
solvable by a circuit in one model can also be solved in the other. This equivalence 
does not extend to problems that take these circuits as input. 

The wide array of complexity classes often encountered in theoretical computer 
science is not particularly useful or relevant to the results of the thesis. For this reason, 
the introduction of complexity classes is kept quite brief, with only a few of the most 
important classes introduced. 


2.1 Quantum circuits 



Many questions on quantum channels can be extended to computational problems. 
This extension leaves one difficulty: what is the correct way to encode a quantum chan- 
nel as input to a computational problem? One obvious choice is to provide the Kraus 
operators or the unitary matrix from a Stinespring dilation. Such a representation 
allows for any quantum channel to be represented approximately, as these matrices 
can only be specified up to some precision. Viewed computationally, however, this 
representation is unsatisfying. The reason for this is that the description of the chan- 
nel is polynomial in the input and output dimensions, which are exponential in the 
number of qubits needed to represent the input and output. This representation is 
similar to modelling any classical process as a table of inputs and outputs - this form 
is convenient, but often exponentially larger than necessary. 

Taking a hint from classical complexity theory, we will represent quantum channels 
using circuits. These circuits will allow for the simulation of a quantum channel from 
the circuit description, but they will not, in general, allow the efficient solution to 
most of the computational problems on these channels. This is equivalent to the 
classical case, where the circuit satisfiability problem is used to represent the problem 
of determining if a computation can be made to accept. Providing a complete table 
of outputs as the input to this problem trivializes it, as in the case of a polynomial 
size circuit, the table encodes the information in the circuit in an exponentially larger 
description. The problems on quantum computations that we consider in this thesis 
are similarly trivialized by a representation of quantum channels as matrices of size 
exponential in the number of input and output qubits. 

The most widely used model of quantum computation is the unitary circuit model. 
In this model a computation is represented by a directed acyclic graph, where the edges 
represent qubits and the nodes represent gates. In order for a circuit to implement 
a valid quantum operation, each gate is labelled with a quantum channel that maps 
the state of the input qubits to the state of the output qubits. The operations that can 
appear as gates in a circuit depend on exactly which model of quantum computation 
is being used. As one final restriction, no isolated vertices in the graph are allowed, 
because these would correspond to gates in the circuit that neither take input nor 
produce output, and so they cannot affect the computation being performed. 

There are two important quantities related to a circuit: size and depth. If a circuit is 
represented as a graph, the size of the circuit is the number of vertices, i.e. the number 
of gates in the circuit. This definition leaves the possibility of very small circuits acting 
on a large number of input qubits - this undesirable feature is avoided by taking 
the size of a circuit to be the maximum of the number of gates and the number of 
qubits that the circuit acts on. Using this definition, the size of a circuit is essentially 
equivalent to the number of bits needed to represent the circuit, so long as the number 
of different types of gates available is constant. 

Figure 2.1: An example quantum circuit. 

The depth of a circuit is the length of the longest directed path in the graph. As 
circuits are acyclic, the depth of a circuit can be efficiently computed from a description. 
Since the transformations implemented by gates acting on different qubits commute, 
they can be performed in parallel. This implies that the depth of a circuit represents 
essentially the minimum amount of time used by an implementation of the circuit, 
provided that gates acting on disjoint sets of qubits can be performed in parallel. 
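
The two quantities can be computed from a graph description in a few lines. The sketch below makes two assumptions that go beyond the text: depth is counted as the number of gates on the longest directed path, and the three-gate circuit on four qubits is a hypothetical example rather than a transcription of any figure.

```python
from functools import lru_cache

def circuit_size(num_gates, num_qubits):
    """Size is the maximum of the number of gates and the number of qubits acted on."""
    return max(num_gates, num_qubits)

def circuit_depth(successors):
    """Number of gates on the longest directed path of an acyclic gate graph.

    `successors` maps each gate to the gates that consume one of its output qubits.
    """
    @lru_cache(maxsize=None)
    def longest_from(gate):
        return 1 + max((longest_from(nxt) for nxt in successors[gate]), default=0)

    return max((longest_from(gate) for gate in successors), default=0)

# Hypothetical circuit: g1 and g2 act in parallel on disjoint pairs of qubits,
# and g3 acts on one output qubit from each of them.
successors = {"g1": ("g3",), "g2": ("g3",), "g3": ()}

depth = circuit_depth(successors)
size = circuit_size(num_gates=len(successors), num_qubits=4)
print(size, depth)
```

Memoizing the recursion keeps the computation linear in the number of edges, which reflects the remark that depth can be computed efficiently because circuits are acyclic.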



As an example of size and depth, the circuit in Figure 2.1 takes four qubits as input, 
produces two qubits as output, and has size four and depth two. In this figure, and 
in all the circuit diagrams that will appear in the thesis, the gates in the circuit are 
represented by boxes and the edges by the lines connecting them. The edges in circuits 
are directed, but by convention, the edges in the diagrams appearing here are always 
directed from left to right, so that time flows left to right during the evaluation of 
the circuit. The circuit in the example maps four qubits to two qubits. This can be 
thought of as a channel $\Phi \in T(\mathcal{H}, \mathcal{K})$, where $\dim \mathcal{H} = 2^4 = 16$ and $\dim \mathcal{K} = 2^2 = 4$. 
An alternate view of this circuit, where $\mathcal{A}$ is a Hilbert space of dimension two, is as a 
transformation in $T(\mathcal{A}^{\otimes 4}, \mathcal{A}^{\otimes 2})$. We will take whichever view is most convenient, as 
these two sets of transformations are isomorphic. 

As a further notational convenience, throughout the thesis, a circuit $C$ will be 
identified with the transformation $C \in T(\mathcal{H}, \mathcal{K})$ that it implements, so that for a state 
$\rho \in D(\mathcal{H})$ the state $C(\rho)$ is the output of the circuit when executed on the input state 
$\rho$. Each circuit specifies exactly one transformation, but the converse is not true: any 
quantum channel has (infinitely) many circuit implementations, and so given only a 
transformation, a circuit implementing it will need to be carefully constructed. In most 
of the cases we will encounter this is not difficult to do. 

Circuits as a model for quantum computation are significantly easier to work with 
than earlier models of computation, such as the model of quantum Turing machines 
introduced in [BV97]. When the transformations that can be used as gates are restricted 
to the right set, it is known that the circuit model of computation is equivalent to the 
quantum Turing machine model [Yao93]. For this reason, we will use the circuit model 
of quantum computation, though for most of the results in this thesis the exact model 
of computation will not be important. 



2.1.1 Unitary circuits 

The most commonly used model of quantum circuits is the unitary circuit model. 
In this model every gate implements a unitary transformation on one or two qubits. 
Unitarity implies that all the gates of this model have the same number of input and 
output qubits. 

It is known that any unitary operation can be approximately represented using
only a finite set of one- and two-qubit unitary gates. The set of gates we will use is
given in Figure 2.2 and the proof that it is (approximately) universal is due to Boykin
et al. [BMP+00]. Different universality proofs for slightly different sets of gates can be
found in [Sho96, KLZ98, ABO08]. An excellent overview of this and other universal
sets of gates, as well as a proof that the gate set used here is universal, can be found
in [NC00].

We have seen a few of these gates before: the Pauli X and Z and Hadamard gates
simply apply the corresponding unitary operators to the qubits they act on, where the
operators X, Z, and H are as defined in Section 1.2. A few of these operators are new.
In matrix form, the swap, controlled-not (CNOT), and $\pi/8$ (T) gates are given by

\[
\mathrm{SWAP} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \quad
\mathrm{CNOT} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix}, \quad
T = \begin{pmatrix} 1 & 0 \\ 0 & e^{i\pi/4} \end{pmatrix}.
\]

For the sake of convenience we have added a few gates to the circuit model. The Pauli
X and Z and swap gates are not needed for a universal set of gates. They can, however,
be constructed exactly using gates from the standard set. Two of these gates are simple
to build from the standard set:

\[ Z = T^4, \]

\[ X = HZH = HT^4H. \]



Figure 2.2: Gates in the unitary circuit model (the Pauli X gate, the Pauli Z gate, the
Hadamard gate, the $\pi/8$ gate, the two-qubit swap gate, and the controlled-not gate).
The Pauli X and Z gates, and the swap gate are not required for universality, but they
are included for convenience. The $\pi/8$ gate is needed for universality, but will not be
used in any of the circuits constructed outside of this section.



Figure 2.3: Simulation of the swap gate with three controlled-not gates. 
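
The exact constructions of the convenience gates can be checked numerically. The sketch below assumes the standard matrix forms of the gates (in particular, that the $\pi/8$ gate is the diagonal matrix with entries 1 and $e^{i\pi/4}$) and the basis ordering $|00\rangle, |01\rangle, |10\rangle, |11\rangle$; the helper names `matmul` and `close` are ours and are not part of any library.

```python
import cmath
import math

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def close(a, b, tol=1e-12):
    """Entrywise equality up to floating-point error."""
    return all(abs(a[i][j] - b[i][j]) < tol
               for i in range(len(a)) for j in range(len(a[0])))

s = 1 / math.sqrt(2)
H = [[s, s], [s, -s]]                            # Hadamard
T = [[1, 0], [0, cmath.exp(1j * math.pi / 4)]]   # pi/8 gate (assumed standard form)
X = [[0, 1], [1, 0]]                             # Pauli X
Z = [[1, 0], [0, -1]]                            # Pauli Z

# Two-qubit gates; the first qubit is the most significant bit of the index.
CNOT_12 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]  # control 1, target 2
CNOT_21 = [[1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0], [0, 1, 0, 0]]  # control 2, target 1
SWAP = [[1, 0, 0, 0], [0, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 1]]

T2 = matmul(T, T)
z_is_t4 = close(matmul(T2, T2), Z)                      # Z = T^4
x_is_hzh = close(matmul(H, matmul(Z, H)), X)            # X = HZH
swap_is_three_cnots = close(
    matmul(CNOT_12, matmul(CNOT_21, CNOT_12)), SWAP)    # three alternating CNOTs
```

All three flags evaluate to true, which agrees with the identities for Z and X and with the three-gate simulation of the swap gate in Figure 2.3.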



The third unnecessary gate, the swap gate, can be implemented in the standard model
using no gates at all! This is because the unitary operation that swaps two qubits can
be introduced into a circuit by simply redirecting the edges in the underlying graph.
In many practical models of computation it is nontrivial to connect gates together in
arbitrary directed graphs. One such model is the nearest-neighbour model, where a
qubit can only interact with the qubits immediately adjacent to it. This model (with
polynomial depth and size overhead) can simulate the more permissive model if the
swap gate is included in the circuit model, since the required qubits for any operation
can always be swapped together. The swap gate can be implemented as a series of
three controlled-not gates in this model, as shown in Figure 2.3. We will use W to
represent the gate that swaps the input systems, so that $W|a\rangle|b\rangle = |b\rangle|a\rangle$ even when
the dimension of the systems to be swapped is larger than two. This gate can be
implemented using several two-qubit swap gates. The introduction of these gates
does not change the circuit model, as they can be exactly implemented in the model of
Boykin et al. [BMP+00] using a constant number of gates.

Figure 2.4: Controlled-U gate.



It is often very useful in a quantum circuit to control the application of some
unitary operation based on the value of an additional qubit. For a unitary U, this is
the operation commonly known as a controlled-U gate. We have already encountered
one such gate: the controlled-not gate in the standard model is exactly the controlled
application of the Pauli X gate. Given a unitary operation U (as a circuit), the controlled-U
operation is the unitary operation that applies U if the control qubit is in the $|1\rangle$ state and does
nothing if the control qubit is in the $|0\rangle$ state. The representation of this gate in the
circuit model is shown in Figure 2.4. For a unitary $U \in U(\mathcal{H})$, this gate is represented
in block matrix form as

\[ \Lambda(U) = \begin{pmatrix} \mathbb{1} & 0 \\ 0 & U \end{pmatrix}. \]



Given a circuit for U, it is simple to construct one for the controlled-U operation. 
Each gate in the circuit can be replaced by a controlled version, so that all of the 
gates are applied (i.e. U is applied) or none of the gates are applied. The controlled 
versions of each gate in the basis need to be constructed, but these are guaranteed to 
(approximately) exist by the fact that we are using a complete basis of gates. Notice, 
however, that this construction may add significantly to the depth of the circuit: the 
single control qubit is used many times. A more depth-conscious construction is
presented in Section 2.1.3.
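
As a small illustration of the block matrix form of the controlled-U gate, the following sketch builds the matrix with an identity block (control in the zero state) and a U block (control in the one state). It assumes that the control qubit is the most significant one in the basis ordering; the function name `controlled` is hypothetical.

```python
def controlled(u):
    """Return the block-diagonal matrix diag(identity, u) as a list of rows."""
    d = len(u)
    out = [[0] * (2 * d) for _ in range(2 * d)]
    for i in range(d):
        out[i][i] = 1                       # control is 0: do nothing
        for j in range(d):
            out[d + i][d + j] = u[i][j]     # control is 1: apply u
    return out

X = [[0, 1], [1, 0]]
CNOT = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]

cnot_from_x = controlled(X)           # controlled application of Pauli X
toffoli = controlled(cnot_from_x)     # adding a second control qubit
```

Applying the construction to the Pauli X gate reproduces the controlled-not gate, and applying it twice gives the doubly controlled not on three qubits.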



2.1.2 Mixed-state circuits 

Circuits in the unitary model can (approximately) represent any unitary computation. 
If these circuits are allowed access to ancillary qubits in a known pure state, then 
they can perform any efficient quantum computation. There is, however, a drawback 
to this model: unitary circuits cannot simulate an arbitrary quantum channel. This 
is because a general completely positive and trace preserving operation may discard 
information, it may make measurements in the middle of a computation, or it may 
introduce ancillary qubits in a mixed state. Many of these operations are impossible 
to implement in the unitary model.

Figure 2.5: Non-unitary gates in the mixed state circuit model.

Figure 2.6: Simulations of the measurement gate, the completely dephasing gate, and
the completely depolarizing gate with the other gates in the circuit model.



For this reason we will use the mixed-state circuit model of Aharonov, Kitaev, and
Nisan [AKN98]. Circuits in this model can (approximately) represent any quantum
channel. This can be thought of as a probabilistic model of quantum computation,
as the state of the computation can be a mixed state, whereas the unitary model can
be thought of as deterministic computation, since the state during the computation is
always pure. The gates available in this model are the standard gates from the unitary
model, as well as the additional gates shown in Figure 2.5.



As in the unitary model, we have included a few unnecessary gates for the sake of
convenience. The only two gates that are actually required are the gate that introduces
ancillary qubits in the $|0\rangle$ state and the gate that traces out a qubit. The gate that makes
a measurement in the computational basis can be implemented using an ancillary qubit
and a controlled-not gate, as shown in Figure 2.6. The output of this measurement gate
is viewed as a mixed quantum state in the following way. If the measurement outcome
is 0 with probability $p$ and 1 with probability $1 - p$, the outcome of the measurement
gate is $p|0\rangle\langle 0| + (1 - p)|1\rangle\langle 1|$. This density matrix is diagonal, and so it may be thought
of as a classical probability distribution, but there is no loss in generality in encoding
this distribution as a mixed quantum state. Controlled operations will function in
exactly the same way regardless of whether they are controlled classically or by the
density matrix corresponding to the same probability distribution. This measurement
gate, as it turns out, performs exactly the same transformation as the decoherence gate
D included in the gate set. The decoherence gate in Figure 2.5 applies to systems of
an arbitrary number of qubits, but this operation has the property that when applied
individually to a number of qubits, the result is exactly the same as if it had been
applied to all of them at once. The completely depolarizing channel also has this
property, and so it is sufficient to give an implementation for the channels acting only
on a single qubit, as is done in Figure 2.6.

Figure 2.7: The unitary operation U simulates the admissible operation $\Phi$.



Since the standard model of quantum computation is inherently probabilistic, as
we will see in Section 2.2, it is not hard to show that the mixed-state model is equivalent
in computational power to the unitary circuit model [AKN98]. The central idea behind
the equivalence of the unitary and mixed-state models is the fact that any quantum
channel can be implemented in Stinespring form, which is the introduction of ancillary
qubits first, followed by a unitary operation on the now larger space, and finally tracing
out any qubits that are not part of the output. For a channel $\Phi \in T(\mathcal{H}, \mathcal{K})$, this is exactly
the Stinespring representation

\[ \Phi(\rho) = \operatorname{tr}_{\mathcal{B}} U (\rho \otimes |0\rangle\langle 0|) U^*, \]

where $\mathcal{B}$ is the space that is traced out and the unitary U is implemented by a unitary
circuit. An example of this is illustrated in Figure 2.7. This equivalence is noted in
[AKN98], making use of what is known about quantum channels in the physics
literature [Sti55, HK70].
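
As a concrete instance of the Stinespring form, the single-qubit measurement (completely dephasing) gate can be simulated by adding one ancillary qubit in the zero state, applying a controlled-not gate, and tracing out the ancilla. The sketch below works directly with density matrices as lists of rows; all helper names are ours, and the choice of the controlled-not gate as the unitary U is the one suggested by the description of Figure 2.6 rather than a reproduction of that figure.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def kron(a, b):
    """Tensor (Kronecker) product of two matrices."""
    return [[a[i][j] * b[k][l]
             for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def adjoint(a):
    """Conjugate transpose."""
    return [[complex(a[j][i]).conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

def trace_out_second_qubit(m):
    """Partial trace over the second qubit of a two-qubit operator."""
    return [[m[2 * i][2 * j] + m[2 * i + 1][2 * j + 1] for j in range(2)]
            for i in range(2)]

CNOT = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]
ZERO_PROJECTOR = [[1, 0], [0, 0]]   # density matrix of the ancilla in state 0

def dephase(rho):
    """Ancilla in, controlled-not, ancilla out: keeps only the diagonal of rho."""
    joint = kron(rho, ZERO_PROJECTOR)
    evolved = matmul(CNOT, matmul(joint, adjoint(CNOT)))
    return trace_out_second_qubit(evolved)

plus_state = [[0.5, 0.5], [0.5, 0.5]]   # pure state with maximal coherence
measured = dephase(plus_state)           # off-diagonal entries are removed
```

The output is diagonal with the same diagonal entries as the input, so it can be read as the classical distribution of measurement outcomes encoded as a mixed state.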



Despite this computational equivalence these two models are not identical. The
distinguishability problem discussed in Chapter 5 seems to be significantly harder than
the distinguishability problem for unitary computations. If this is not the case, there
are unexpected consequences in complexity theory [Vya03]. Even stronger evidence
is provided by the close images problem studied in Chapter 4. This problem involves
determining the distance between the images of two transformations. If these
transformations are unitary, their images always intersect, rendering the problem trivial.
For these reasons, the standard model of quantum computation used throughout the
thesis is the mixed-state circuit model. Any quantum channel that is given as the input
of a computational problem will be in the form of a classical description of a circuit in
this model.



2.1.3 Short quantum circuits 

A significant challenge in the experimental realization of a quantum computation 
is the need to keep a quantum system from interacting with the environment. The 
decoherence caused by these interactions in practice provides a time limit for the 
computation. One way to ameliorate this difficulty is to find low-depth circuits that 
solve the problems we are interested in. 

Short quantum circuits have been found for several important problems, such as
the approximate quantum Fourier transform [CW00] and encoding and decoding
operations for many error correcting codes [MN02]. These examples show the significant
power of circuits that have depth logarithmic in the size of the circuits. More
evidence for the power of short circuits is provided by Terhal and DiVincenzo [TD04]
and improved by Fenner et al. [FGHZ05], who show that exactly computing the
acceptance probabilities for constant-depth quantum circuits is as hard as simulating
general quantum computation. Fenner et al. also show that, under certain restrictions,
the acceptance probabilities for these circuits can be efficiently approximated.

The purpose of this section is to give a construction for the controlled version of a
log-depth circuit on n qubits that results in a depth O(log n) circuit. It is not immediately
clear how a controlled operation on n qubits, such as a controlled-swap operation,
can be performed in depth logarithmic in n. The straightforward implementation
outlined in Section 2.1.1 requires using one control qubit to control each of the gates in
the operation, resulting in a linear depth circuit. Moore and Nilsson [MN02] use a
construction from reversible computing to reduce the depth of this technique.

Proposition 2.1 (Moore and Nilsson [MN02]). Any log-depth operation on n qubits
controlled by one qubit can be implemented in O(log n) depth with O(n) ancillary qubits.

Moore and Nilsson prove this only for the case of constant-depth operations, but the
proof technique used also applies to the log-depth case. They prove this proposition
using a tree of controlled-not operations of depth O(log n) to 'duplicate' the control
qubit onto n ancillary qubits. These copies only capture the information in the
computational basis, but this is exactly the same information that is used by the
controlled gates. These extra control qubits are then used to control the remaining
operations, with each control qubit used a logarithmic number of times. Finally, the
tree of controlled-not operations is reversed to clean up the ancillary qubits so that
they can be traced out without decohering the system. This procedure is demonstrated
in Figure 2.8. This implies, as an example, that the 2n-qubit controlled swap gate can
be implemented in depth O(log n). This will be critical to the construction used in
Chapter 4.

Figure 2.8: Log-depth implementation of controlled operation on n qubits.
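
The tree of controlled-not operations can be sketched as a schedule of layers, where every qubit that already holds a copy of the control value serves as the control for one fresh ancilla in the next layer. This is a minimal sketch of the idea rather than the exact circuit of Figure 2.8; the function names and the convention that qubit 0 is the control and qubits 1 to n are the ancillas are assumptions made for the illustration.

```python
def fanout_layers(n):
    """Layers of (control, target) pairs copying qubit 0 onto ancillas 1..n."""
    holders = [0]        # qubits that already carry the control value
    next_free = 1        # next ancilla still in the zero state
    layers = []
    while next_free <= n:
        layer = []
        for c in list(holders):
            if next_free > n:
                break
            layer.append((c, next_free))
            holders.append(next_free)
            next_free += 1
        layers.append(layer)
    return layers

def run_on_bits(layers, bits):
    """Apply the controlled-not gates to a computational basis state."""
    bits = list(bits)
    for layer in layers:
        for c, t in layer:
            bits[t] ^= bits[c]
    return bits

layers = fanout_layers(7)                              # three layers for n = 7
copied = run_on_bits(layers, [1] + [0] * 7)            # control value spread out
cleaned = run_on_bits(list(reversed(layers)), copied)  # reversed tree uncomputes
```

The number of holders doubles in each layer, so the number of layers grows logarithmically in n, while the total number of controlled-not gates is n. Running the layers in reverse order returns every ancilla to the zero state, which is the clean-up step described above.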

If the unbounded fan-out gate is allowed into the standard basis of gates, then the 
depth overhead added in this construction can be reduced to a constant. This gate 
performs a controlled-not operation from one control qubit to any number of target 
qubits in one computational step. This gate is not in the standard basis for mixed- 
state quantum computing: it requires a linear number of gates and a logarithmic 
depth circuit to implement in the standard gate model. Fan-out in classical circuits 
is simply the operation that copies the value from one bit to several other bits, and 
is often included in the standard circuit model. When such a gate is included in the 
usual quantum circuit models, many tasks become much simpler. As an example, this 
gate allows operations such as sorting, phase estimation, and the quantum Fourier 
transform to be approximated with constant depth circuits [HŠ05]. This gate will
not generally be included in the model, but some of the results in the thesis can be 
strengthened when it is. 

To see how the scheme for implementing controlled operations can be carried out
in constant depth using this gate, notice that the tree structure of Figure 2.8 can be replaced
with a single fan-out gate. This allows n 'copies' of the control qubit to be created,
which can then be used to control each of the n operations. A final application of the
fan-out gate restores the ancillary qubits to the $|0\rangle$ state. This is demonstrated in
Figure 2.9, which implies the following proposition.

Figure 2.9: Constant depth implementation of controlled operation on n qubits using
the unbounded fan-out gate.



Proposition 2.2. If the unbounded fan-out gate is in the basis of gates, any constant-depth
operation on n qubits controlled by one qubit can be implemented in O(1) depth with O(n)
ancillary qubits.



2.2 Quantum complexity classes 

This section provides a brief overview of the quantum complexity classes related to 
the topic of this thesis. Many of the technical details related to the definitions of these 
classes are omitted, as a detailed understanding of complexity theory is not essential 
to the results that follow. For a more complete reference, see the recent survey of
Watrous [Wat09b]. The known relationships between the classes discussed here and
some of the more well known classical complexity classes are illustrated in Figure 2.10.



BQP, defined in [BV97], is the quantum complexity class of primary importance.
This class is informally the set of all decision problems that are efficiently solvable with
a quantum computer. As quantum computation can involve measurements that have
inherently probabilistic outcomes, the quantum computation that solves a problem
in BQP is permitted to fail with some bounded probability. This probability can be
made arbitrarily small by using the standard trick of repeating the computation several
times in parallel and taking the majority. Error reduction for this class is exactly as
for probabilistic classical computations: this is because the inputs and outputs to the
decision problems are classical strings that may be copied any number of times.

Figure 2.10: Known relationships between the major quantum and classical complexity
classes. Classes are contained in the classes written above them. Only the containment
$\mathrm{P} \subset \mathrm{EXP}$ is known to be proper.

More formally, BQP is the set of all languages L for which there exists a uniform
family Q of polynomial size quantum circuits, one for each input length, such that

1. if $x \in L$, then $\operatorname{tr}(\Pi Q(x)) \geq 2/3$,

2. if $x \notin L$, then $\operatorname{tr}(\Pi Q(x)) \leq 1/3$,

where $\Pi$ is the projector onto the subspace on which the first output qubit of Q is in
the accepting state, i.e. the projector onto the accepting subspace for the circuit. The
error bounds 2/3 and 1/3 here are not significant: they can be replaced with any a > b
that have at least an inverse polynomial gap between them (in the size of the input
string x) as noted above.
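
The effect of repeating the computation and taking the majority can be made concrete with a short calculation. The sketch below assumes k independent runs, each correct with the same probability, and an odd k so that the majority is always defined; the function name `majority_error` is hypothetical.

```python
from math import comb

def majority_error(p_correct, k):
    """Probability that at most half of k independent runs are correct (k odd)."""
    return sum(comb(k, j) * p_correct ** j * (1 - p_correct) ** (k - j)
               for j in range(k // 2 + 1))

single_run = majority_error(2 / 3, 1)      # error of one run: 1/3
three_runs = majority_error(2 / 3, 3)      # error of the majority of three: 7/27
many_runs = majority_error(2 / 3, 101)     # far smaller with 101 runs
```

The error decreases exponentially in the number of repetitions, which is why the particular constants 2/3 and 1/3 in the definition are not significant.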

Several BQP-complete promise problems are known. The most famous of these is 
probably the approximation of the Jones polynomial, which is in BQP by an algorithm 
of Aharonov et al. [AJL06] (or by earlier works of Freedman et al. [FKW02]) and
is complete for BQP by a result of Freedman et al. [FLW02]. These problems give
an important method for the study of quantum computation that is not necessarily 
connected to quantum information. 

Extending the definition of BQP to include a single message from a computationally
unbounded prover results in the class QMA, which is the quantum analogue of NP.
This concept was first considered in [Kni96], first defined in [Kit99], and first studied
in [Wat00]. QMA is the class of all problems that can be verified by a polynomial-time
quantum verifier with access to a quantum proof. This proof is a quantum state
on a polynomial number of qubits and may depend on the input. More formally, a
language L is in QMA if there is a family of circuits Q such that

1. if $x \in L$, then there exists $\rho$ such that $\operatorname{tr}(\Pi Q(x, \rho)) \geq 2/3$,

2. if $x \notin L$, then for any $\rho$, $\operatorname{tr}(\Pi Q(x, \rho)) \leq 1/3$,

where once again $\Pi$ is the projector onto the accepting subspace of the output of Q.
As in the case of BQP the error parameters 2/3 and 1/3 are not significant. Replacing
these with a, b such that $|a - b|$ is at least inverse polynomial in the input length n is
also possible in this case, though the argument is not simple [KSV02, MW05].

Similar to BQP, the class QMA has complete promise problems. The simplest of 
these is the 2-local Hamiltonian problem, which is informally the quantum version 
of the satisfiability problem for unitary circuits with gates of constant size. A formal 
description of this problem, as well as a proof that the 5-local Hamiltonian problem is
QMA-complete, can be found in [KSV02]. The improvement of this result to the 2-local
case is due to Kempe, Kitaev, and Regev [KKR06].

Extending BQP further by allowing multiple rounds of interaction with a prover
results in the complexity class QIP, first defined in [Wat03]. This class is the quantum
analogue of the classical class IP, which is equal to the more familiar class
PSPACE [LFKN92, Sha92] of problems solvable with a polynomial amount of space.
A recent result has also shown that QIP = PSPACE [JJUW09], resolving a major open
problem in quantum computational complexity.

As a more formal definition, for any language $L \in \mathrm{QIP}$, there is a polynomial time
quantum algorithm V, known as the verifier, that exchanges quantum messages with
a prover P. Both the prover and the verifier receive the input string x before the start of
the computation. The verifier's algorithm V must be generated from the input string x
in polynomial time, but the prover's algorithm is not constrained in this way. Given a
pair (V, P) the verifier V will accept that $x \in L$ with some probability after interacting
with the prover P. An example of this interaction is shown in Figure 2.11, with the
Hilbert spaces available to each party illustrated. For any input x, the verifier V in a
QIP protocol satisfies

1. if $x \in L$, then there exists a prover P such that (V, P) accepts with probability at
least 2/3.

2. if $x \notin L$, then for any prover P, (V, P) accepts with probability at most 1/3.

Once again, the exact parameters used in this definition are not significant. It is
known how to use parallel repetition in this model of computation for error reduction,
so that as long as the probabilities are an inverse polynomial apart and the probability
of acceptance in condition 2 is nonzero, the resulting class of problems does not
change [KW00].

Figure 2.11: A three message quantum interactive proof system. The verifier's polynomial
time transformations are $V_1$ and $V_2$ and the prover's transformations are given by
$P_1$ and $P_2$. All messages are sent through the message space $\mathcal{M}$ and the verifier does not
have access to the prover's private space $\mathcal{P}$. At the end of the interaction, a measurement
of the verifier's private space $\mathcal{V}$ determines the acceptance of the computation.
All three of these spaces start in the $|0\rangle$ state. No restrictions are made on the size of
the space $\mathcal{P}$, but the spaces $\mathcal{M}$ and $\mathcal{V}$ do not contain more than a polynomial number of
qubits. The circuits $V_1, V_2, P_1, P_2$ may depend on the input x. The circuits $V_1, V_2$ must
be generated from the input x in polynomial time, but the circuits $P_1$ and $P_2$ are not so
restricted.

An interesting property of quantum interactive proof systems is that any quantum
interactive proof system can be simulated by one using only three messages [KW00].
This is in contrast to the classical case, where constant round proof systems seem to be
much weaker than polynomial round proof systems. For this reason, we may assume
that any problem in QIP has a proof system as shown in Figure 2.11, in which the
prover and verifier each perform exactly two transformations, with the verifier
acting last.

It is easy to see from the definitions that $\mathrm{QMA} \subseteq \mathrm{QIP}$, as interactive proofs with
three messages can only be stronger than those using only one message. It is expected
that this containment is strict: if not, unexpected things happen to classical complexity
classes (it would imply that $\mathrm{PH} \subseteq \mathrm{PP}$) [Vya03]. There is, however, no proof that this
cannot happen, as it would resolve a long-standing open problem in complexity theory
(showing that NP is properly contained in EXP). The case of two message quantum
interactive proofs is even more interesting, as quite little is known about this class. It is
known that the class of problems with two message proofs is contained in PSPACE [JUW09],
but this result has been subsumed by the result that QIP = PSPACE using similar
techniques [JJUW09].

The class QIP also has complete (promise) problems. The Close Images problem
is the first such problem known. This problem is implicitly defined and shown to be
complete for QIP in [KW00], where it is used to show that $\mathrm{QIP} \subseteq \mathrm{EXP}$. This problem
is the focus of Chapter 4, where a formal definition can be found.

This thesis adds several new problems to the list of problems that are complete for
QIP. Close Images is effectively a restatement of the acceptance conditions for a quantum
interactive proof system, as we will see in Section 4.2, and so these new complete
problems provide a method for studying QIP that involves quantum information and
is not strongly tied to the model of quantum interactive proof systems.

The three quantum complexity classes BQP, QMA, and QIP are the only classes
that will be encountered in this thesis. With the exception of Section 4.2, where Close
Images is shown to be complete for QIP from the definition, it is not essential to have
a deep understanding of these definitions. More important is to maintain the intuitive
picture that problems in BQP are easy, problems in QMA are difficult, and problems
in QIP are even more difficult.

Chapter 3

Measures for Quantum Information



Distances and other measures give a quantitative method to evaluate how close together
two states are, how mixed a state is, or how well a quantum channel can be used
to transmit information. These are all tasks that are central to the problems discussed 
in the thesis, and so we introduce several different techniques for measuring these 
quantities. It is important that there are several such measures, as most of these 
measures have an operational meaning that can help to ground the otherwise abstract 
problems that we consider. 

The primary quantities discussed in this chapter include the entropy, the Schatten 
p-norms, and the trace norm on quantum states. Also included is an overview of some 
of the extensions of these quantities to the case of channels. A brief overview of the 
problems related to the additivity of the classical capacity is also provided. 

This chapter is largely a collection of these measures, with proofs of the important
properties that they have. The results in Sections 3.5.1 and 3.7 are the product of joint
work with John Watrous [RW05], and the remainder of the results discussed here are
not new and can be found in several of the standard sources.

3.1 Entropy

The entropy of a quantum state can be viewed as a measure of the amount of uncer- 
tainty about the value of the state. In support of this intuitive picture, the entropy of a 
pure state is zero as this represents the case where (in principle) complete knowledge of 
the state is present. The other extreme is the completely mixed state $\mathbb{1}/d$ on a space of
dimension $d$, where nothing at all is known about the system, which corresponds to
the maximum entropy.

In the case of a classical probability distribution p, the entropy is defined to be

\[ S(p) = -\sum_x p(x) \log p(x), \]

where the logarithm is taken base two. The entropy was first used in an information
theoretic context by Shannon in 1948 [Sha48], who derived it from axioms that he felt
that any such measure of uncertainty should satisfy. By convention $0 \log 0$ is defined
to be $0$ in this equation.

This quantity has a generalization to quantum systems that was first developed
by von Neumann in 1927 [vN27]. This version of the entropy, applied to a density
operator $\rho$ with eigenvalues $\lambda_i$, is given by

\[ S(\rho) = -\operatorname{tr} \rho \log \rho = -\sum_i \lambda_i \log \lambda_i. \tag{3.1} \]

This quantity is often called the von Neumann entropy. In the case of a probability
distribution p encoded as a diagonal matrix with diagonal entries p(x), these two
quantities agree, which is why the Shannon entropy is usually thought of as a special
case of Equation (3.1).



Returning to the previous examples, notice that since a pure state expressed as a
density operator has exactly one nonzero eigenvalue, the entropy is given by

\[ S(|\psi\rangle\langle\psi|) = -1 \log 1 = 0. \]

In the case of the completely mixed state $\mathbb{1}_{\mathcal{H}}/d$ on a space $\mathcal{H}$ of dimension d, there are d
eigenvalues, each with value 1/d. Computing the entropy, we obtain

\[ S(\mathbb{1}_{\mathcal{H}}/d) = -\sum_{i=1}^{d} \frac{1}{d} \log \frac{1}{d} = \log d. \]

A good reference on the properties of the entropy and many of the quantities derived
from it can be found in [BZ06] and [NC00]. The exposition of the entropy presented
here follows these sources.

One property of the entropy that is useful is that it is additive with respect to the
tensor product. To see this, let $\rho = \sum_i \lambda_i |\psi_i\rangle\langle\psi_i|$ and $\sigma = \sum_j \gamma_j |\phi_j\rangle\langle\phi_j|$ be two density
operators. Expanding the definition of the entropy, we have

\[ S(\rho \otimes \sigma) = -\sum_{i,j} \lambda_i \gamma_j \log \lambda_i \gamma_j = -\sum_{i,j} \lambda_i \gamma_j \log \lambda_i - \sum_{i,j} \lambda_i \gamma_j \log \gamma_j = S(\rho) + S(\sigma), \tag{3.2} \]

where we have made use of the fact that since $\rho, \sigma$ are density operators $\sum_i \lambda_i = \sum_j \gamma_j = 1$.
This implies that for a multiparty quantum system that is not entangled,
the entropy of the complete system can be determined locally. This is not true for
entangled systems: if two parties each share half of a maximally entangled state, the
local entropies are maximized, but the global state is pure, so it has zero entropy.
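
Since the entropy depends only on the eigenvalues, additivity is easy to check numerically. The sketch below takes the spectra of the two density operators as plain lists of probabilities (an assumption that avoids any eigenvalue computation) and uses the fact that the eigenvalues of a tensor product are the pairwise products; the function names are ours.

```python
from math import log2

def entropy(eigenvalues):
    """Entropy in bits of a spectrum, with the convention 0 log 0 = 0."""
    return sum(-x * log2(x) for x in eigenvalues if x > 0)

def product_spectrum(lam, gam):
    """Eigenvalues of the tensor product of two density operators."""
    return [a * b for a in lam for b in gam]

lam = [0.5, 0.5]       # spectrum of rho
gam = [0.25, 0.75]     # spectrum of sigma
joint = entropy(product_spectrum(lam, gam))
separate = entropy(lam) + entropy(gam)

# Maximally entangled pair of qubits: each half is completely mixed,
# while the global state is pure.
local_entropy = entropy([0.5, 0.5])
global_entropy = entropy([1.0])
```

The values `joint` and `separate` agree up to rounding, while in the entangled example the two local entropies sum to two bits even though the global entropy is zero.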

It is easy to see that the entropy is always nonnegative: the eigenvalues of a density 
matrix are always in the range [0, 1] as all density matrices are by definition positive 
operators with unit trace. It is more difficult to see that on a Hilbert space of dimension 
d, no state can have entropy greater than log d. One way to see this intuitively is to
notice that since the logarithm is concave, Equation (3.1) is maximized when there are
as many eigenvalues as possible, each of them being as small as possible. Formalizing
this argument will require the use of Klein's inequality. The inequality stated here is a 
special case of Klein's 1931 result, but it is all that will be needed. 

Theorem 3.1 (Klein's Inequality [Kle31]). Let $\rho, \sigma \in D(\mathcal{H})$, then

\[ \operatorname{tr}(\rho \log \rho) - \operatorname{tr}(\rho \log \sigma) \geq 0, \]

with equality only when $\rho = \sigma$.

This inequality immediately implies that the completely mixed state is the unique
state with maximum entropy.

Proposition 3.2. Let $\mathcal{H}$ be a Hilbert space of dimension d, then for any $\rho \in D(\mathcal{H})$

\[ 0 \leq S(\rho) \leq \log d. \]

Furthermore, $S(\rho) = 0$ implies that $\rho$ is a pure state, and $S(\rho) = \log d$ implies that $\rho = \mathbb{1}_{\mathcal{H}}/d$.



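Proof. The first inequality is simple: every term $-\lambda_i \log \lambda_i$ of Equation (3.1) is
nonnegative, and if rank($\rho$) > 1 then some eigenvalue lies strictly between 0 and 1,
so that $S(\rho) > 0$. On the other hand, if $\rho$ is pure a simple
calculation reveals that $S(\rho) = 0$.

The second inequality is a direct consequence of Klein's inequality (Theorem 3.1)
applied to $\rho$ and $\mathbb{1}_{\mathcal{H}}/d$:

\[ 0 \leq \operatorname{tr}(\rho \log \rho - \rho \log (\mathbb{1}_{\mathcal{H}}/d)) = \log d - S(\rho). \]

This implies that $S(\rho) \leq \log d$, with equality only when $\rho = \mathbb{1}_{\mathcal{H}}/d$, by the equality
condition of Klein's inequality. □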
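
The equality used in the second part of the proof can be spelled out in one more step. The following short derivation only uses the facts that the logarithm of the completely mixed state on a space of dimension d is $\log(1/d)$ times the identity and that a density operator has unit trace.

```latex
\begin{align*}
\operatorname{tr}\bigl(\rho \log \rho - \rho \log (\mathbb{1}_{\mathcal{H}}/d)\bigr)
  &= \operatorname{tr}(\rho \log \rho) - \log(1/d) \operatorname{tr}(\rho) \\
  &= -S(\rho) + \log d .
\end{align*}
```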

Klein's inequality can be used to prove another important property of the entropy: 
concavity. This property is similar to the triangle inequality, except that the inequality 
goes in the opposite direction. 

Proposition 3.3. Let $\rho, \sigma, \xi \in D(\mathcal{H})$, with $\rho = q\sigma + (1 - q)\xi$, where $0 \leq q \leq 1$. Then

\[ S(\rho) \geq q S(\sigma) + (1 - q) S(\xi). \]



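Proof. Expanding the definition of S, we have

\[ S(\rho) = -\operatorname{tr} \rho \log \rho = -q \operatorname{tr} \sigma \log \rho - (1 - q) \operatorname{tr} \xi \log \rho. \tag{3.3} \]

Klein's inequality (Theorem 3.1) implies that $\operatorname{tr} \sigma \log \sigma \geq \operatorname{tr} \sigma \log \rho$, and similarly for $\xi$
in place of $\sigma$. Together with Equation (3.3), this implies that

\[ S(\rho) \geq -q \operatorname{tr} \sigma \log \sigma - (1 - q) \operatorname{tr} \xi \log \xi = q S(\sigma) + (1 - q) S(\xi), \]

which is the desired inequality. □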
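
Concavity can also be observed numerically in the special case where the two states are diagonal in the same basis, so that the mixture is again diagonal and its eigenvalues can be read off directly. This restriction is an assumption made to keep the sketch free of eigenvalue computations; the proposition itself holds for arbitrary density operators. The function name `concavity_gap` is hypothetical.

```python
from math import log2

def entropy(probs):
    """Entropy in bits of a diagonal density operator given by its diagonal."""
    return sum(-x * log2(x) for x in probs if x > 0)

def concavity_gap(sigma, xi, q):
    """Entropy of the mixture minus the mixture of the entropies."""
    rho = [q * s + (1 - q) * x for s, x in zip(sigma, xi)]
    return entropy(rho) - (q * entropy(sigma) + (1 - q) * entropy(xi))

# Mixing two orthogonal pure states equally gives the completely mixed qubit.
gap_orthogonal = concavity_gap([1.0, 0.0], [0.0, 1.0], 0.5)

# The gap stays nonnegative over a grid of mixing weights.
gaps = [concavity_gap([0.9, 0.1], [0.2, 0.8], k / 10) for k in range(11)]
```

The gap is one bit in the first example and vanishes at the endpoints q = 0 and q = 1, where the mixture coincides with one of the two states.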
3.1.1 Minimum output entropy

The entropy can be extended to quantum channels in a straightforward way: by
minimizing the entropy over the output states of the channel. The resulting quantity
is known as the minimum output entropy, which is defined for a transformation
$\Phi \in T(\mathcal{H}, \mathcal{K})$ as

\[ S_{\min}(\Phi) = \min_{\rho \in D(\mathcal{H})} S(\Phi(\rho)). \tag{3.4} \]

This extension of the entropy to quantum channels, as well as the properties of the
entropy that have been demonstrated here, will be essential to the results of Section 7.6
concerning the additivity of the minimum output entropy on the tensor product of
two channels. This is closely related to the capacity of a quantum channel for the
transmission of classical information. This is discussed in Section 3.3.



3.2 Schatten p-norms 

One of the more useful distance measures that can be defined on quantum states comes
from the Schatten p-norms [Sch60]. This is the extension to operators in $L(\mathcal{H}, \mathcal{K})$ of the
usual $\ell_p$ norm of a sequence $\{x_i\}$, which for $1 \le p < \infty$ is defined by

\[ \|x\|_p = \Big( \sum_i |x_i|^p \Big)^{1/p}. \tag{3.5} \]

For $p = \infty$, this norm is given by $\|x\|_\infty = \sup_i |x_i|$, which can be obtained by taking
the limit of Equation (3.5) as $p \to \infty$. The extension of this norm to an operator
$A \in L(\mathcal{H}, \mathcal{K})$ is done by taking this norm on the singular values of $A$, so that for
$1 \le p < \infty$

\[ \|A\|_p = \left( \mathrm{tr}\, |A|^p \right)^{1/p} = \|s(A)\|_p, \tag{3.6} \]

where $s(A)$ is the (finite) sequence of singular values of $A$. The extension to the case
$p = \infty$ is, as in the vector case, given by $\|A\|_\infty = \max_i s_i(A)$. Two of these norms are
widely used in quantum information: the case $p = 1$ corresponds to the trace norm,
considered in more detail in Section 3.4, and the case of $p = \infty$ corresponds to the
usual operator norm on $L(\mathcal{H}, \mathcal{K})$. This section discusses some of the most important
properties of these norms. A more complete overview of the properties of these norms,
as well as the properties of more general classes of norms, can be found in [Bha97].
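Equation (3.6) translates directly into code. The following is a minimal sketch, assuming
NumPy; the function name `schatten_norm` and the example matrix are illustrative and not
part of the thesis.

```python
import numpy as np

def schatten_norm(A, p):
    """Schatten p-norm of A: the l_p norm of the singular values of A."""
    s = np.linalg.svd(A, compute_uv=False)
    if np.isinf(p):
        return float(s.max())                  # operator norm
    return float(np.sum(s ** p) ** (1.0 / p))

# Example matrix with singular values 3*sqrt(5) and sqrt(5)
A = np.array([[3.0, 0.0], [4.0, 5.0]])
trace_norm = schatten_norm(A, 1)               # sum of singular values
frobenius = schatten_norm(A, 2)                # sqrt of the sum of squared entries
operator_norm = schatten_norm(A, np.inf)       # largest singular value
```

For this matrix the three values are $4\sqrt{5}$, $\sqrt{50}$ and $3\sqrt{5}$, so the norms decrease
as $p$ increases, as is the case for the vector $\ell_p$ norms.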



Recall from Definition 1.1 that any norm satisfies the three properties of nonnegativity,
homogeneity, and the triangle inequality. The first two of these properties
are easily verified for these norms. Equation (3.6) is only zero when all the singular
values are zero, which establishes nonnegativity (Equation (1.1)). Homogeneity
(Equation (1.2)) follows directly from the definition of the absolute value of a matrix
and the linearity of the trace.

Verifying the triangle inequality (Equation (1.3)) for this norm is nontrivial, and so
only a brief overview of this result is presented here. The most important part of this
proof is a 1951 result of Ky Fan [Fan51], which is a majorization relation on the singular
values of $A + B$ in terms of the singular values of $A$ and $B$. As is common in this thesis,
the finite-dimensional result that is presented is a considerable simplification of the
known result, which holds in the infinite-dimensional case.

Theorem 3.4 (Ky Fan [Fan51]). Let $A, B$ be $n \times n$ matrices, and let $s(A)$ denote the sequence
of singular values of $A$ in decreasing order. Then for all $k \in \{1, \ldots, n\}$

\[ \sum_{i=1}^{k} s_i(A + B) \le \sum_{i=1}^{k} s_i(A) + \sum_{i=1}^{k} s_i(B), \tag{3.7} \]

and more generally, if $\tau$ is any symmetric gauge function,

\[ \tau(s(A + B)) \le \tau(s(A)) + \tau(s(B)). \]

The triangle inequality for $\|\cdot\|_p$ in the cases that $p = 1, \infty$ is a direct consequence
of Equation (3.7) for $k = n$ and $k = 1$, respectively. The triangle inequality for the
remaining values of $p$ follows from Fan's theorem and the fact that for a vector $x \in \mathbb{R}^n$,
the function $\tau(x) = \left( \sum_{i=1}^{n} |x_i|^p \right)^{1/p}$ is a symmetric gauge function. This is a
strengthened version of the property that the function $\tau(\cdot)$ is a norm (in this case,
the $\ell_p$ norm). More details, as well as detailed arguments that $\|\cdot\|_p$ is a norm using
the theory of symmetric gauge functions, can be found in [Bha97, HJ91].
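Equation (3.7) can be spot-checked on random matrices. This is only a numerical sanity
check under the assumption of real Gaussian test matrices, not a substitute for the proof;
the variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5
A = rng.normal(size=(n, n))
B = rng.normal(size=(n, n))

# numpy returns singular values in decreasing order, as required by Theorem 3.4
s_sum = np.linalg.svd(A + B, compute_uv=False)
s_a = np.linalg.svd(A, compute_uv=False)
s_b = np.linalg.svd(B, compute_uv=False)

# slack[k-1] = sum_{i<=k} (s_i(A) + s_i(B) - s_i(A+B)), nonnegative by (3.7)
slack = np.cumsum(s_a + s_b - s_sum)
```

The first entry of `slack` is the triangle inequality for the operator norm ($k = 1$) and
the last entry is the triangle inequality for the trace norm ($k = n$).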



The p-norm satisfies two more properties that will be essential to the results in
Chapter 7. The first of these is unitary invariance. This is the property that for any
unitary operators $U, V$

\[ \|UAV\|_p = \|A\|_p. \tag{3.8} \]

It is easy to prove that this property holds. Consider a singular value decomposition
of $A$ given by $\sum_i s_i |\phi_i\rangle\langle\psi_i|$; then a singular value decomposition for $UAV$ is given by

\[ \sum_i s_i \left( U|\phi_i\rangle \right) \left( \langle\psi_i| V \right), \]

from which it can be seen that both $A$ and $UAV$ have the same singular values. Then,
as the p-norm is defined in Equation (3.6) solely in terms of the singular values of $A$,
the p-norms of $A$ and $UAV$ must be identical.

The second important property of the p-norm is that it is multiplicative with respect
to the tensor product of two operators. That is, for operators $A, B$

\[ \|A \otimes B\|_p = \|A\|_p \|B\|_p. \tag{3.9} \]

This property follows directly from the properties of the tensor product. Let singular
value decompositions of $A$ and $B$ be given by $A = \sum_i s_i |\phi_i\rangle\langle\psi_i|$ and $B = \sum_j t_j |\gamma_j\rangle\langle\lambda_j|$;
then a singular value decomposition of $A \otimes B$ is

\[ \Big( \sum_i s_i |\phi_i\rangle\langle\psi_i| \Big) \otimes \Big( \sum_j t_j |\gamma_j\rangle\langle\lambda_j| \Big) = \sum_{i,j} s_i t_j\, |\phi_i\rangle\langle\psi_i| \otimes |\gamma_j\rangle\langle\lambda_j|, \tag{3.10} \]

so that the singular values of the tensor product $A \otimes B$ are simply the products of the
singular values of $A$ and $B$. From this relationship Equations (3.5) and (3.6) imply the
desired property.
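Both properties, Equations (3.8) and (3.9), can be observed numerically. The sketch
below assumes NumPy; `random_unitary` is a hypothetical helper that takes the unitary
factor of a QR decomposition, and the value $p = 1.5$ is an arbitrary choice.

```python
import numpy as np

def schatten_norm(A, p):
    s = np.linalg.svd(A, compute_uv=False)
    return float(np.sum(s ** p) ** (1.0 / p))

def random_unitary(d, rng):
    # The Q factor of a QR decomposition of a complex Gaussian matrix is unitary
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    q, _ = np.linalg.qr(g)
    return q

rng = np.random.default_rng(2)
A = rng.normal(size=(3, 3))
B = rng.normal(size=(2, 2))
U, V = random_unitary(3, rng), random_unitary(3, rng)

p = 1.5
# Unitary invariance, Equation (3.8)
invariance_gap = abs(schatten_norm(U @ A @ V, p) - schatten_norm(A, p))
# Multiplicativity under the tensor (Kronecker) product, Equation (3.9)
product_gap = abs(schatten_norm(np.kron(A, B), p)
                  - schatten_norm(A, p) * schatten_norm(B, p))
```

Both gaps should vanish up to floating-point error for any $p \ge 1$.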



3.2.1 Maximum output p-norm 

These norms have an important extension to channels. This is the maximum output
p-norm, which, for a channel $\Phi \in T(\mathcal{H}, \mathcal{K})$ is denoted $\nu_p(\Phi)$, or sometimes simply
$\|\Phi\|_p$. For $1 \le p \le \infty$, this norm is given by

\[ \nu_p(\Phi) = \max_{\rho \in D(\mathcal{H})} \|\Phi(\rho)\|_p. \tag{3.11} \]

This quantity is normally defined by taking the maximum over all inputs $X \in L(\mathcal{H})$
with $\|X\|_1 = 1$, but a result of Amosov and Holevo [AH03] implies that in the case of $\Phi$
completely positive, this maximization can be restricted to the density operators. Note
that this simplification cannot be made in the case of the difference of two channels, as the
resulting operation is not completely positive, by a counterexample found in [Wat05].
In this thesis the maximum output p-norm will only be applied to channels, never the
difference of two channels, so the simplification of the definition in Equation (3.11) is
justified.

In the next section, this quantity is related to the capacity of a quantum channel for 
the transmission of classical information, the exact specification of which is currently 
an important problem in quantum information. 



3.3 The classical capacity of a quantum channel 

The additivity of the capacity for a quantum channel to communicate classical information
is one of the most important unresolved problems in quantum information
theory. Informally, the additivity problem is: given two uses of a quantum channel,
is it possible to send more than twice the classical information that could be sent with
a single use? This is a common oversimplification; the classical capacity is defined in
terms of the average amount of information sent per channel use, asymptotically with
the number of uses of the channel [Hol98, SW97]. A more correct statement of the
problem is: when encoding for the transmission of classical information, is entanglement
across multiple uses of the channel necessary to achieve the best communication
rate? This refined problem stood open for over 10 years before a counterexample was
recently found by Hastings [Has09].

In this section this problem is given a formal definition and the closely related
problems of the additivity of the minimum output entropy and the multiplicativity
of the maximum output p-norm are discussed. The minimum output entropy and
maximum output p-norm both involve maximizing the purity of the output of a
channel, a problem that is intuitively related to the classical capacity by the notion
that a channel that is less noisy should be able to send more information. A recent
survey of these problems can be found in [Hol06], though it does not include the
recent counterexamples to both additivity [Has09] and the related problem of the
multiplicativity of the maximum output p-norm [HW08].



The classical capacity of a channel $\Phi$, when the input to multiple uses of the channel
is restricted to product states, is given by the $\chi$-capacity [Hol98, SW97]

\[ C_\chi(\Phi) = \max \Big[ S\Big( \sum_i p_i \Phi(\rho_i) \Big) - \sum_i p_i S(\Phi(\rho_i)) \Big], \tag{3.12} \]

where the maximum is taken over all convex mixtures $\sum_i p_i \rho_i$ of quantum states.
This quantity is also referred to as the Holevo capacity or the "one-shot" or "one-step"
capacity of $\Phi$. The question of the additivity of this quantity, i.e. whether entangling
inputs across multiple uses of the channel can increase the capacity, was first raised
in [BFS97], and the conjecture that stood until recently was that

\[ C_\chi(\Phi \otimes \Psi) = C_\chi(\Phi) + C_\chi(\Psi). \tag{3.13} \]

This is the statement that entangled inputs do not increase the classical information
carrying capacity of quantum channels. This conjecture has recently been shown false
in general by a result on the additivity of the minimum output entropy [Has09]. This
result implies that the maximum rate at which classical information can be transmitted
using a channel is given by

\[ C(\Phi) = \lim_{n \to \infty} \frac{1}{n} C_\chi(\Phi^{\otimes n}). \]

If it were the case that $C_\chi$ were additive, this formula would simplify to $C(\Phi) = C_\chi(\Phi)$,
but it is now known that this is not the case. This leaves open the question on many
restricted classes of channels: a survey of some of these special cases can be found
in [Hol06].
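The bracketed quantity in Equation (3.12) is straightforward to evaluate for a fixed
ensemble. The following is a minimal sketch under stated assumptions: base-2 logarithms,
NumPy, and a hypothetical qubit depolarizing channel `depolarize` chosen only as an
example. Since no maximization over ensembles is performed, each value is a lower
bound on $C_\chi$ rather than the capacity itself.

```python
import numpy as np

def entropy(rho):
    ev = np.linalg.eigvalsh(rho)
    ev = ev[ev > 1e-12]
    return float(-np.sum(ev * np.log2(ev)))

def depolarize(rho, lam):
    """Hypothetical example channel: rho -> (1 - lam) rho + lam tr(rho) 1/2."""
    return (1 - lam) * rho + lam * np.trace(rho) * np.eye(2) / 2

def holevo_quantity(probs, states, channel):
    """S(sum_i p_i Phi(rho_i)) - sum_i p_i S(Phi(rho_i)) for one fixed ensemble."""
    outputs = [channel(rho) for rho in states]
    average = sum(p * out for p, out in zip(probs, outputs))
    return entropy(average) - sum(p * entropy(out) for p, out in zip(probs, outputs))

zero = np.diag([1.0, 0.0])                     # |0><0|
one = np.diag([0.0, 1.0])                      # |1><1|
ensemble = ([0.5, 0.5], [zero, one])

chi_noiseless = holevo_quantity(*ensemble, lambda r: depolarize(r, 0.0))
chi_half = holevo_quantity(*ensemble, lambda r: depolarize(r, 0.5))
chi_useless = holevo_quantity(*ensemble, lambda r: depolarize(r, 1.0))
```

With no noise the ensemble carries one bit per use, with complete depolarization it
carries none, and intermediate noise gives a value strictly in between.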

The $\chi$-capacity captures exactly the amount of classical information that can be
transmitted per use of the channel when encoding with product states, but it is somewhat
awkward to work with. In the effort to resolve the additivity question it has been
related to both the minimum output entropy and the maximum output p-norm.



3.3.1 Relation to the minimum output entropy 

The minimum output entropy, defined by Equation (3.4), is simpler and often easier
to work with than the $\chi$-capacity. The additivity of this quantity, given by

\[ S_{\min}(\Phi \otimes \Psi) = S_{\min}(\Phi) + S_{\min}(\Psi), \tag{3.14} \]

was first conjectured by King and Ruskai [KR01], who attribute the conjecture to Shor.
The additivity of this quantity is connected to the additivity of the $\chi$-capacity by a
result of Shor [Sho04] that shows that both of these conjectures are globally equivalent
to a third conjecture: the strong superadditivity of the entanglement of formation.
Hastings has recently given a probabilistic construction that shows that this conjecture
is false in general [Has09], which also implies the non-additivity of the $\chi$-capacity.

One direction of Shor's construction in [Sho04] to show that the additivity of $C_\chi$ is
equivalent to the additivity of $S_{\min}$ is very complicated, but the other direction is quite
simple.



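Theorem 3.5 (Shor [Sho04]). The additivity of $C_\chi$ implies the additivity of $S_{\min}$.

Proof. Let $\Phi_1, \Phi_2$ be arbitrary channels in $T(\mathcal{H}, \mathcal{K})$. We will construct channels
$\Phi'_1, \Phi'_2$ in the larger space $T(\mathcal{C} \otimes \mathcal{H}, \mathcal{K})$. The channel $\Phi'_i$ uses the input space
$\mathcal{C} = \mathcal{K} \otimes \mathcal{K}$ to determine which of the discrete Weyl operators to apply to the
output of $\Phi_i$, i.e.

\[ \Phi'_i\left( |a,b\rangle\langle a,b| \otimes \rho \right) = W_{a,b}\, \Phi_i(\rho)\, W_{a,b}^*, \]

where the unitaries $W_{a,b}$ are the discrete Weyl operators introduced in Section 1.2.
This process is applied decoherently, which is to say that $\Phi'_i$ measures the space $\mathcal{C}$ in
the computational basis to decide which operator to apply. As we will later show that
the uniform mixture of the discrete Weyl operators forms the completely depolarizing
channel (Proposition 7.2), the result of placing a completely mixed state in the input
space $\mathcal{C}$ results in the completely mixed state as output, which is

\[ \Phi'_i\Big( \frac{\mathbb{1}_\mathcal{C}}{\dim \mathcal{C}} \otimes \rho \Big) = \frac{\mathbb{1}_\mathcal{K}}{\dim \mathcal{K}}. \]

Let $\rho_1$ and $\rho_2$ be states achieving the minimum output entropy for $\Phi_1$ and $\Phi_2$,
respectively. We will assume that $C_\chi$ is additive, and show that if $S_{\min}$ is not additive
on $\Phi_1 \otimes \Phi_2$ then we can find a contradiction. To do this we compute $C_\chi(\Phi'_i)$ with the
maximization in Equation (3.12) restricted to inputs of the form $\frac{\mathbb{1}_\mathcal{C}}{\dim \mathcal{C}} \otimes \rho_i$, which is given
by

\[ \begin{aligned}
S\Big( \frac{\mathbb{1}_\mathcal{K}}{\dim \mathcal{K}} \Big) - \frac{1}{\dim \mathcal{C}} \sum_{a,b} S\big( W_{a,b} \Phi_i(\rho_i) W_{a,b}^* \big)
 &= \log \dim \mathcal{K} - \frac{1}{\dim \mathcal{C}} \sum_{a,b} S(\Phi_i(\rho_i)) \\
 &= \log \dim \mathcal{K} - S(\Phi_i(\rho_i)) \\
 &= \log \dim \mathcal{K} - S_{\min}(\Phi_i).
\end{aligned} \tag{3.15} \]

Notice that this is the optimal value of $C_\chi(\Phi'_i)$, since the first term is maximized and
the second term is minimized. This implies that restricting to inputs of the form
$\frac{\mathbb{1}_\mathcal{C}}{\dim \mathcal{C}} \otimes \rho_i$ does not reduce the value of $C_\chi(\Phi'_i)$. As we have assumed that $C_\chi$ is
additive, the sum of these values for $i = 1, 2$ is also the optimal value of $C_\chi(\Phi'_1 \otimes \Phi'_2)$.
However, if $S_{\min}$ is not additive on $\Phi_1 \otimes \Phi_2$, then any state $\sigma$ on which

\[ S((\Phi_1 \otimes \Phi_2)(\sigma)) < S_{\min}(\Phi_1) + S_{\min}(\Phi_2) \]

can be used to increase $C_\chi(\Phi'_1 \otimes \Phi'_2)$, exactly as in the derivation of Equation (3.15),
contradicting the (assumed) additivity of $C_\chi$. □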
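The key step in the proof, that a uniform mixture of the discrete Weyl operators sends
every state to the completely mixed state, can be checked numerically. The sketch below
assumes the common convention $W_{a,b} = X^a Z^b$ with $X$ the cyclic shift and $Z$ the
diagonal clock operator; the definition in Section 1.2 may differ by phases, which do not
affect conjugation. The function name `weyl_operators` is illustrative.

```python
import numpy as np

def weyl_operators(d):
    """All d*d operators X^a Z^b (one common convention, assumed here)."""
    omega = np.exp(2j * np.pi / d)
    shift = np.roll(np.eye(d), 1, axis=0)      # X|j> = |j+1 mod d>
    clock = np.diag(omega ** np.arange(d))     # Z|j> = omega^j |j>
    return [np.linalg.matrix_power(shift, a) @ np.linalg.matrix_power(clock, b)
            for a in range(d) for b in range(d)]

d = 3
rng = np.random.default_rng(3)
g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
rho = g @ g.conj().T
rho = rho / np.trace(rho).real                 # a random density matrix

# Uniform mixture of the conjugations W rho W^*
twirled = sum(W @ rho @ W.conj().T for W in weyl_operators(d)) / d**2
deviation = np.linalg.norm(twirled - np.eye(d) / d)
```

The deviation from $\mathbb{1}/d$ should be zero up to rounding, independently of the input state.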
This result of Shor [Sho04], coupled with Hastings' results [Has09], shows that $C_\chi$
is non-additive in general. As the proof is constructive, it also shows that if $S_{\min}$
is not additive on a class of channels, then $C_\chi$ is also not additive for the related
class of channels given by applying the construction in the proof to all the channels
in the class. This result is included here as it is of a similar flavour to many of
the results in the thesis: reducing one problem to another, by embedding arbitrary
channels into channels with specific properties, is a powerful method for obtaining
results. In fact, Fukuda has used the same construction to show that the additivity of
the Holevo capacity and the minimum output entropy can be restricted to the unital
channels without loss of generality [Fuk07]. As in the case of the $\chi$-capacity, these
non-additivity results leave open the question of which classes of channels the
minimum output entropy is additive on. This is currently an important open
problem in quantum information, as a deeper understanding of the channels for which
the minimum output entropy is not additive may lead to a deeper understanding of
those channels for which $C_\chi$ is not additive.

3.3.2 Relation to the maximum output p-norm

In an effort to better understand the question of the additivity of the minimum output
entropy, yet another question has been raised. This problem is the multiplicativity
of the maximum output p-norm, which was first conjectured by Amosov, Holevo,
and Werner [AHW00]. The conjecture corresponding to this quantity was that it is
multiplicative with respect to the tensor product of two channels, i.e. that

\[ \nu_p(\Phi \otimes \Psi) = \nu_p(\Phi)\, \nu_p(\Psi). \tag{3.16} \]

This conjecture is not true in general: channels can be constructed for any fixed $p > 1$
that falsify this conjecture [HW08].

Amosov, Holevo, and Werner have related the multiplicativity of $\nu_p$ to the additivity
of $S_{\min}$ [AHW00]. This relationship can be used to show that the minimum output
entropy is additive for a pair of channels whenever the maximum output p-norm
is multiplicative on the same channels for values of $p$ arbitrarily close to 1.

Theorem 3.6 (Amosov, Holevo, Werner [AHW00]). Let $\Phi_1, \Phi_2 \in T(\mathcal{H}, \mathcal{K})$. If for some
sequence $p_i \to 1$, with $p_i > 1$ for all $i$, it holds that

\[ \nu_{p_i}(\Phi_1 \otimes \Phi_2) = \nu_{p_i}(\Phi_1)\, \nu_{p_i}(\Phi_2), \]

then the minimum output entropy satisfies

\[ S_{\min}(\Phi_1 \otimes \Phi_2) = S_{\min}(\Phi_1) + S_{\min}(\Phi_2). \]

Proof. Let the sequence $\{p_i\}$ be as in the statement of the theorem. The result is easy to
verify by introducing the quantum Rényi entropy of order $p$ of a density matrix $\sigma$

\[ R_p(\sigma) = \frac{1}{1-p} \log \mathrm{tr}\, \sigma^p. \tag{3.17} \]

The important property is that taking the limit as $i \to \infty$ (so that $p_i \to 1$ from above)
results in the usual entropy (up to a factor due to the base of the logarithm)

\[ \lim_{i \to \infty} R_{p_i}(\sigma) = (\log e)\, S(\sigma). \]

This can be verified by invoking l'Hôpital's rule on Equation (3.17). By observing that
the logarithm of the p-norm can be used to recover $R_p$, since
$\log \|\sigma\|_p = \frac{1-p}{p} R_p(\sigma)$, we may conclude from the previous equation that

\[ \lim_{i \to \infty} \frac{p_i}{1 - p_i} \log \|\sigma\|_{p_i} = \lim_{i \to \infty} R_{p_i}(\sigma) = (\log e)\, S(\sigma). \]

From this, the multiplicativity of the maximum output p-norm immediately implies
the additivity of the minimum output entropy. This follows from the fact that
$\log \nu_{p_i}(\Psi) \to 0$ as $\nu_{p_i}(\Psi) \to 1$ for any channel $\Psi$. More concretely, we have

\[ \begin{aligned}
S_{\min}(\Phi_1 \otimes \Phi_2) &= \frac{1}{\log e} \lim_{i \to \infty} \frac{p_i}{1 - p_i} \log \nu_{p_i}(\Phi_1 \otimes \Phi_2) \\
 &= \frac{1}{\log e} \lim_{i \to \infty} \frac{p_i}{1 - p_i} \log \nu_{p_i}(\Phi_1) + \frac{1}{\log e} \lim_{i \to \infty} \frac{p_i}{1 - p_i} \log \nu_{p_i}(\Phi_2) \\
 &= S_{\min}(\Phi_1) + S_{\min}(\Phi_2),
\end{aligned} \]

as desired. □
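The limit used in the proof can be observed numerically on a fixed spectrum. The sketch
below assumes natural logarithms in both the Rényi and the von Neumann entropy, so that
the factor $\log e$ equals one; the spectrum and the function names are illustrative.

```python
import numpy as np

def renyi_entropy(eigenvalues, p):
    """R_p = log(tr sigma^p) / (1 - p), with natural logarithms."""
    return float(np.log(np.sum(eigenvalues ** p)) / (1.0 - p))

def von_neumann_entropy(eigenvalues):
    return float(-np.sum(eigenvalues * np.log(eigenvalues)))

spectrum = np.array([0.5, 0.3, 0.2])           # eigenvalues of a density matrix
target = von_neumann_entropy(spectrum)

# The approximation error shrinks as p approaches 1 from above
errors = [abs(renyi_entropy(spectrum, p) - target) for p in (1.5, 1.1, 1.01, 1.001)]

# Relation between R_p and the p-norm: log ||sigma||_p = ((1 - p) / p) R_p(sigma)
p = 1.1
p_norm = float(np.sum(spectrum ** p) ** (1.0 / p))
identity_gap = abs(np.log(p_norm) - (1 - p) / p * renyi_entropy(spectrum, p))
```

Because $\frac{p}{1-p}$ is negative for $p > 1$, maximizing the p-norm of the output corresponds
to minimizing its Rényi entropy, which is how $\nu_p$ enters the final chain of equalities.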

The minimum output entropy and the maximum output p-norm will be encountered
again in Chapter 7, where it is shown that the additivity (respectively multiplicativity)
of a channel can be rephrased in terms of the additivity (multiplicativity) of a
related set of mixed-unitary channels.



3.3.3 Additivity and multiplicativity on classes of channels 

The questions of the additivity of the Holevo capacity (also called the $\chi$-capacity) and
the multiplicativity of the maximum output p-norm have been resolved for many of
the restricted classes of channels that are studied in this thesis.

It is shown in [CRS08] that the additivity of degradable channels is equivalent
to the additivity of general channels, using a result from [FW07]. Combining this
result with the result that the additivity problems are equivalent on a class of channels
and the class of complementary channels [Hol07, KMNR07] shows that the additivity
of the antidegradable channels is also equivalent to the general case. The recent
result of Hastings [Has09] can then be used to show that there exist degradable and
antidegradable channels that are not additive. It is perhaps not a surprise that these
channels are well-behaved: the degradable and antidegradable channels cannot be
used to transmit quantum information to the environment and receiver, respectively,
because to do so would violate the principle of no-cloning.

It is also perhaps not a surprise that on the entanglement breaking channels the
minimum output entropy is additive [Sho02] and the maximum output p-norm is
multiplicative [Kin03]. The problem of distinguishing channels of this class, however,
is quite interesting and remains open.

The unital channels cannot decrease the entropy [KR01]. This property makes them
interesting from the perspective of additivity, as channels that do not reduce entropy
would seem to be a natural noise model. Fukuda has shown how to construct a unital
channel from a general channel, without changing the minimum output entropy or
the maximum output p-norm [Fuk07], using the same construction used by Shor in
the proof of Theorem 3.5 [Sho04]. This implies that
for a set of channels the question of additivity can be rephrased in terms of the additivity
of a related set of unital channels.

The mixed-unitary channels are a subclass of the unital channels. In the case
of qubit mixed-unitary channels, both additivity and multiplicativity are known to
hold [Kin02]. For general mixed-unitary channels, both additivity [Has09] and
multiplicativity [HW08] are known to fail. In fact, all of the recent counterexamples
to additivity and multiplicativity are obtained by choosing a random mixed-unitary
channel from some distribution. This makes these channels very interesting. It is
shown in Chapter 7 that the additivity or multiplicativity for a general channel can be
reduced to the approximate additivity or multiplicativity of a mixed-unitary channel.



3.4 The trace norm 

The trace norm is perhaps the most important measure of size and distance in quantum
information. The trace norm of the difference of two states measures how distinguishable
the two states are, which makes this an essential quantity for the problems
considered in this thesis. The remainder of this section surveys some properties of
this norm; further background on the trace norm can be obtained in the books [NC00]
and [Bha97].

We have already encountered the trace norm: it is simply the $p = 1$ case of the
Schatten p-norm discussed in Section 3.2. This implies that $\|X\|_{\mathrm{tr}}$ is given by the sum
of the singular values of $X$, though it is often useful to define this norm by an explicit
formula. Such a formula can be obtained by noticing that the singular values of $X$ are
exactly the square roots of the eigenvalues of the positive operator $X^*X$. Using this
observation, the trace norm can equivalently be defined as

\[ \|X\|_{\mathrm{tr}} = \|X\|_1 = \mathrm{tr} \sqrt{X^* X}. \tag{3.18} \]

The trace norm inherits many properties from the p-norm. The triangle inequality is the
$k = n$ case of Fan's Theorem (3.4), and unitary invariance is given by Equation (3.8).
One other convenient property is that $\|\rho\|_{\mathrm{tr}} = 1$ for any density matrix $\rho$, which is
implied by the fact that density operators are positive semidefinite operators with unit
trace, so that the eigenvalues are all nonnegative and sum to one. Several other properties
of the trace norm can be easily derived from the following characterization.

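Lemma 3.7. For any $X \in L(\mathcal{H})$,

\[ \|X\|_{\mathrm{tr}} = \max_{U \in U(\mathcal{H})} |\mathrm{tr}\, XU|. \]

Proof. Let $X$ have a singular value decomposition given by $X = \sum_i s_i |\phi_i\rangle\langle\psi_i|$, where
$\{|\phi_i\rangle\}$ and $\{|\psi_i\rangle\}$ are orthonormal bases for $\mathcal{H}$. Then for any unitary $U \in U(\mathcal{H})$

\[ |\mathrm{tr}\, XU| = \Big| \sum_i s_i \langle\psi_i|U|\phi_i\rangle \Big| \le \sum_i s_i\, |\langle\psi_i|U|\phi_i\rangle| \le \sum_i s_i = \|X\|_{\mathrm{tr}}. \tag{3.19} \]

If the unitary $U$ is chosen such that $U|\phi_i\rangle = |\psi_i\rangle$ then

\[ \langle\psi_i|U|\phi_i\rangle = \langle\psi_i|\psi_i\rangle = 1, \]

for all $i$. In this case equality is achieved in Equation (3.19). □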
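The optimal unitary in the proof can be read off from a numerical singular value
decomposition. The following is a sketch assuming NumPy, whose `svd` routine returns
factors with $X = W \,\mathrm{diag}(s)\, V_h$; the names `U_opt` and `Q` are illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)
d = 4
X = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))

# numpy factors X = W @ diag(s) @ Vh with W and Vh unitary
W, s, Vh = np.linalg.svd(X)
trace_norm = float(np.sum(s))

# Optimal choice: tr(X U) = tr(W diag(s) Vh U) = tr(diag(s)) when U = (W Vh)^*
U_opt = (W @ Vh).conj().T
achieved = abs(np.trace(X @ U_opt))

# Any other unitary, here a random one from a QR decomposition, cannot do better
Q, _ = np.linalg.qr(rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d)))
other = abs(np.trace(X @ Q))
```

Here `achieved` should coincide with the sum of the singular values, while `other` should
be no larger, mirroring the two halves of the proof.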



From this characterization it is easy to see that the trace norm does not increase 
under the partial trace, and in fact, does not increase under the application of any 
channel. This is intuitively obvious: applying any potentially noisy operation to two 
states cannot help to distinguish them. 

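Theorem 3.8. Let $\Phi \in T(\mathcal{H}, \mathcal{K})$, then for any $\rho, \sigma \in D(\mathcal{H})$

\[ \|\Phi(\rho) - \Phi(\sigma)\|_{\mathrm{tr}} \le \|\rho - \sigma\|_{\mathrm{tr}}. \]

Proof. We first prove this for the case that $\Phi \in T(\mathcal{H} \otimes \mathcal{B}, \mathcal{H})$ is the partial trace over the
space $\mathcal{B}$. For any $\rho, \sigma \in D(\mathcal{H} \otimes \mathcal{B})$, this follows directly from Lemma 3.7
since

\[ \|\mathrm{tr}_\mathcal{B}\, \rho - \mathrm{tr}_\mathcal{B}\, \sigma\|_{\mathrm{tr}} = \max_{U \in U(\mathcal{H})} \left| \mathrm{tr}\left[ (\sigma - \rho)(U \otimes \mathbb{1}_\mathcal{B}) \right] \right| \le \max_{U \in U(\mathcal{H} \otimes \mathcal{B})} \left| \mathrm{tr}\, (\sigma - \rho) U \right| = \|\rho - \sigma\|_{\mathrm{tr}}. \]

To see the general case, let $\Phi \in T(\mathcal{H}, \mathcal{K})$ have Stinespring representation given by
$\Phi(\rho) = \mathrm{tr}_\mathcal{B}\, U(\rho \otimes |0\rangle\langle 0|)U^*$. Then the previous equation and the unitary invariance of
the trace norm imply that

\[ \begin{aligned}
\|\Phi(\rho) - \Phi(\sigma)\|_{\mathrm{tr}} &= \left\| \mathrm{tr}_\mathcal{B}\, U \left[ (\rho - \sigma) \otimes |0\rangle\langle 0| \right] U^* \right\|_{\mathrm{tr}} \\
 &\le \left\| U \left[ (\rho - \sigma) \otimes |0\rangle\langle 0| \right] U^* \right\|_{\mathrm{tr}} \\
 &= \left\| (\rho - \sigma) \otimes |0\rangle\langle 0| \right\|_{\mathrm{tr}} \\
 &= \|\rho - \sigma\|_{\mathrm{tr}},
\end{aligned} \]

as required. □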
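The first step of the proof, monotonicity under the partial trace, is simple to test. The
sketch below assumes NumPy; `partial_trace_second` and `random_density_matrix` are
hypothetical helpers written for this illustration.

```python
import numpy as np

def trace_norm(X):
    return float(np.sum(np.linalg.svd(X, compute_uv=False)))

def partial_trace_second(X, d_a, d_b):
    """Trace out the second factor of an operator on C^d_a (x) C^d_b."""
    return np.einsum('ibjb->ij', X.reshape(d_a, d_b, d_a, d_b))

def random_density_matrix(d, rng):
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = g @ g.conj().T
    return rho / np.trace(rho).real

rng = np.random.default_rng(5)
d_a, d_b = 2, 3
rho = random_density_matrix(d_a * d_b, rng)
sigma = random_density_matrix(d_a * d_b, rng)

before = trace_norm(rho - sigma)
after = trace_norm(partial_trace_second(rho, d_a, d_b)
                   - partial_trace_second(sigma, d_a, d_b))
```

Discarding the second system can only make the two states harder to tell apart, so
`after` should never exceed `before`.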
The following theorem due to Helstrom [Hel67] formalizes the notion that the trace
norm of the difference of two density matrices represents how well they can be
distinguished by a measurement. This result underlies the definition of the quantum circuit
distinguishability problem in Chapter 5, as the problem of distinguishing channels
is exactly the problem of distinguishing the outputs of the channels. This result is
easy to generalize to the case that the two density matrices are not chosen with equal
probabilities, but doing so unnecessarily complicates the argument.

Theorem 3.9 (Helstrom [Hel67]). The optimal probability that an unknown state $\xi \in D(\mathcal{H})$
that is chosen uniformly at random from the set $\{\rho, \sigma\}$ can be correctly identified is given by

\[ \frac{1}{2} + \frac{\|\rho - \sigma\|_{\mathrm{tr}}}{4}. \]

Proof. The optimal strategy consists of some two-outcome POVM measurement. By
Naimark's theorem it may be assumed that the optimal measurement is a projective
measurement $\{\Pi_\rho, \Pi_\sigma\}$ with $\Pi_\rho + \Pi_\sigma = \mathbb{1}_\mathcal{H}$, since the operation that embeds $\rho$ and $\sigma$
into a larger space is an isometry, which will not affect the trace norm, by unitary
invariance. The probability that this measurement succeeds is

\[ p_{\mathrm{succ}} = \frac{1}{2} \mathrm{tr}(\Pi_\rho \rho) + \frac{1}{2} \mathrm{tr}(\Pi_\sigma \sigma). \]

Similarly, the probability of failure is

\[ p_{\mathrm{fail}} = \frac{1}{2} \mathrm{tr}(\Pi_\sigma \rho) + \frac{1}{2} \mathrm{tr}(\Pi_\rho \sigma). \]

Subtracting the probability of failure from the probability of success gives the bound

\[ p_{\mathrm{succ}} - p_{\mathrm{fail}} = \frac{1}{2} \mathrm{tr}\left( (\Pi_\rho - \Pi_\sigma)(\rho - \sigma) \right) \le \frac{1}{2} \|\rho - \sigma\|_{\mathrm{tr}}, \tag{3.20} \]

where the inequality follows from Lemma 3.7 by the fact that $\Pi_\rho - \Pi_\sigma$ is a unitary
operator. Adding this equation to the equation $p_{\mathrm{succ}} + p_{\mathrm{fail}} = 1$ results in the bound

\[ 2 p_{\mathrm{succ}} \le 1 + \frac{1}{2} \|\rho - \sigma\|_{\mathrm{tr}}. \tag{3.21} \]

This is the probability given in the statement of the theorem, and so it remains only to
show that it can be achieved.

To see this, consider the projectors $\Pi_+$ and $\Pi_-$ that project onto the positive and
non-positive eigenspaces of $\rho - \sigma$. Re-examining Equation (3.20) with this measurement
results in

\[ \frac{1}{2} \mathrm{tr}\left( (\Pi_+ - \Pi_-)(\rho - \sigma) \right) = \frac{1}{2} \mathrm{tr}\, |\rho - \sigma| = \frac{1}{2} \|\rho - \sigma\|_{\mathrm{tr}}, \]

since the eigenvalues of the Hermitian operator $(\Pi_+ - \Pi_-)(\rho - \sigma)$ are the absolute
values of the eigenvalues of $\rho - \sigma$. This demonstrates that this measurement achieves
the bound in Equation (3.21). □
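The measurement constructed in the proof can be built explicitly from an
eigendecomposition of $\rho - \sigma$. This is a minimal sketch assuming NumPy; the function
name `helstrom_probability` and the example states $|0\rangle\langle 0|$ and $|+\rangle\langle +|$ are
illustrative.

```python
import numpy as np

def helstrom_probability(rho, sigma):
    """Success probability of the measurement {Pi_plus, Pi_minus} from the proof."""
    eigenvalues, vectors = np.linalg.eigh(rho - sigma)
    positive = vectors[:, eigenvalues > 0]
    pi_plus = positive @ positive.conj().T     # outcome "the state was rho"
    pi_minus = np.eye(rho.shape[0]) - pi_plus  # outcome "the state was sigma"
    return float(0.5 * np.trace(pi_plus @ rho).real
                 + 0.5 * np.trace(pi_minus @ sigma).real)

def trace_norm(X):
    return float(np.sum(np.linalg.svd(X, compute_uv=False)))

rho = np.array([[1.0, 0.0], [0.0, 0.0]])       # |0><0|
plus = np.array([[0.5, 0.5], [0.5, 0.5]])      # |+><+|

p_measured = helstrom_probability(rho, plus)
p_formula = 0.5 + trace_norm(rho - plus) / 4   # value given by Theorem 3.9
p_identical = helstrom_probability(rho, rho)   # no information: a coin flip
```

For this pair $\|\rho - \sigma\|_{\mathrm{tr}} = \sqrt{2}$, so both computations should give $\frac{1}{2} + \frac{\sqrt{2}}{4}$, while
identical states can only be guessed with probability one half.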



There is one final property of the trace norm that is needed in Section 3.5.1 and
Chapter 5. This property relates the trace norm of an operator to a Hermitian block
matrix that is closely related to it. This construction is often useful: it is one way to
take a general linear operator and construct a Hermitian operator on a space including
one additional qubit. The proof of this relationship is not difficult or particularly
illuminating, but it is included as this result is critical to some of the proofs that
follow.

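Lemma 3.10. Let $A \in L(\mathcal{H})$ be a linear operator, then

\[ \left\| |0\rangle\langle 1| \otimes A + |1\rangle\langle 0| \otimes A^* \right\|_{\mathrm{tr}} = 2 \|A\|_{\mathrm{tr}}. \]

Proof. Let $r$ be the rank of $A$ and let $n = \dim \mathcal{H}$. If $r = 0$ the result is trivial, so we may
assume that $r > 0$. Let $\hat{A} = |0\rangle\langle 1| \otimes A + |1\rangle\langle 0| \otimes A^*$. Written as a block matrix, this
operator is

\[ \hat{A} = \begin{pmatrix} 0 & A \\ A^* & 0 \end{pmatrix}. \]

For the evaluation of the trace norm of $\hat{A}$ it suffices to consider the eigenvalues,
as this operator is Hermitian by construction. To compute these eigenvalues, let $A$
have singular value decomposition $A = \sum_{i=1}^{r} s_i |\phi_i\rangle\langle\psi_i|$, where $\{|\phi_i\rangle\}$ and $\{|\psi_i\rangle\}$ are
orthonormal sets of vectors. Let the notation $\begin{pmatrix} \phi \\ \psi \end{pmatrix}$ denote the vector of length $2n$
whose first $n$ entries are the entries of $\phi$ and whose second $n$ entries are the entries of
$\psi$. As observed in [Bha97, Section II.1], the $2r$ nonzero eigenvalues of $\hat{A}$ are given by

\[ \hat{A} \begin{pmatrix} \phi_i \\ \pm\psi_i \end{pmatrix} = \pm s_i \begin{pmatrix} \phi_i \\ \pm\psi_i \end{pmatrix} \]

for $i \in \{1, \ldots, r\}$. This implies that

\[ \|\hat{A}\|_{\mathrm{tr}} = 2 \sum_{i=1}^{r} |s_i| = 2 \|A\|_{\mathrm{tr}}, \]

as desired. □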
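The block construction and its spectrum can be verified directly. The sketch below
assumes NumPy and a random complex test matrix; the variable names are illustrative.

```python
import numpy as np

def trace_norm(X):
    return float(np.sum(np.linalg.svd(X, compute_uv=False)))

rng = np.random.default_rng(6)
n = 3
A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))

ket0bra1 = np.array([[0.0, 1.0], [0.0, 0.0]])  # |0><1|
ket1bra0 = ket0bra1.T                          # |1><0|
# Block matrix with A in the upper-right and A^* in the lower-left corner
block = np.kron(ket0bra1, A) + np.kron(ket1bra0, A.conj().T)

is_hermitian = np.allclose(block, block.conj().T)
eigenvalues = np.linalg.eigvalsh(block)        # expected: +s_i and -s_i
singular_values = np.linalg.svd(A, compute_uv=False)
```

The eigenvalues of the block matrix should come in pairs $\pm s_i$, so that its trace norm is
twice the trace norm of $A$.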

The trace norm can be extended from states to channels, though some care must be
taken in doing so to ensure that the resulting norm retains the desirable properties of
the trace norm, such as the relationship to the optimal distinguishing probability. This
extension is the focus of the next section.

3.5 The diamond norm



In this section the diamond norm is introduced and studied. This norm defines the
distance measure that is central to the results that follow on the distinguishability of
channels. The norm is introduced and some of the basic properties of the norm that
can be found in the literature are discussed. Further background on the diamond norm
can be found in [Kit97, AKN98] and [KSV02].

The diamond norm is a norm on channels with similar properties to the trace norm
on quantum states. It will give a numerical value to the distance between quantum
channels, and, more importantly, as in the case of the trace norm, it will be closely
related to how well two channels can be distinguished.

The straightforward way to extend the trace norm to quantum channels is to do as
we have done for the entropy: optimize the output over all input states. Doing this
for the trace norm results in the norm defined by

\[ \|\Phi\|_{\mathrm{tr}} = \max_{X \in L(\mathcal{H}),\, X \ne 0} \frac{\|\Phi(X)\|_{\mathrm{tr}}}{\|X\|_{\mathrm{tr}}}. \tag{3.22} \]

This results in a norm, as it inherits many of the properties of the trace norm directly.
As an example, it is easy to see that this norm obeys the triangle inequality. One of
the other properties inherited by this norm is submultiplicativity. This property will be
useful, and so a short proof of this fact is given below.
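Lemma 3.11. For any $\Phi \in T(\mathcal{K}, \mathcal{F})$ and any $\Psi \in T(\mathcal{H}, \mathcal{K})$,

\[ \|\Phi \circ \Psi\|_{\mathrm{tr}} \le \|\Phi\|_{\mathrm{tr}} \|\Psi\|_{\mathrm{tr}}. \]

Proof. Let $\Phi, \Psi$ be as in the statement of the lemma. Using the definition of the trace
norm on channels in Equation (3.22) we have

\[ \begin{aligned}
\|\Phi\|_{\mathrm{tr}} \|\Psi\|_{\mathrm{tr}} &= \max_{X \in L(\mathcal{K})} \max_{Y \in L(\mathcal{H})} \frac{\|\Phi(X)\|_{\mathrm{tr}}\, \|\Psi(Y)\|_{\mathrm{tr}}}{\|X\|_{\mathrm{tr}}\, \|Y\|_{\mathrm{tr}}} \\
 &\ge \max_{Y \in L(\mathcal{H})} \frac{\|\Phi(\Psi(Y))\|_{\mathrm{tr}}\, \|\Psi(Y)\|_{\mathrm{tr}}}{\|\Psi(Y)\|_{\mathrm{tr}}\, \|Y\|_{\mathrm{tr}}} \\
 &= \max_{Y \in L(\mathcal{H})} \frac{\|\Phi(\Psi(Y))\|_{\mathrm{tr}}}{\|Y\|_{\mathrm{tr}}} \\
 &= \|\Phi \circ \Psi\|_{\mathrm{tr}},
\end{aligned} \]

where the inequality follows by restricting the first maximization to $X = \Psi(Y)$. This is
the inequality in the statement of the lemma. □

One other useful fact related to this norm is that it is always achieved on an input
operator of the form $|\phi\rangle\langle\psi|$. This follows from a direct convexity argument.

Proposition 3.12. For any $\Phi : L(\mathcal{H}) \to L(\mathcal{K})$, there are pure states $|\phi\rangle$ and $|\psi\rangle$ such that

\[ \|\Phi\|_{\mathrm{tr}} = \left\| \Phi(|\phi\rangle\langle\psi|) \right\|_{\mathrm{tr}}. \]

Proof. Let $X \in L(\mathcal{H})$ achieve the maximum in the definition of the norm, and let
$X = \sum_i s_i |\phi_i\rangle\langle\psi_i|$ be a singular value decomposition of $X$. Applying the triangle
inequality to the definition of the operator trace norm (Equation (3.22)) results in

\[ \|\Phi\|_{\mathrm{tr}} = \frac{\|\Phi(X)\|_{\mathrm{tr}}}{\|X\|_{\mathrm{tr}}} = \frac{\left\| \sum_i s_i \Phi(|\phi_i\rangle\langle\psi_i|) \right\|_{\mathrm{tr}}}{\|X\|_{\mathrm{tr}}} \le \sum_i \frac{s_i}{\|X\|_{\mathrm{tr}}} \left\| \Phi(|\phi_i\rangle\langle\psi_i|) \right\|_{\mathrm{tr}}. \]

Then, since the definition of the trace norm implies that $\|X\|_{\mathrm{tr}} = \sum_i s_i$, we may view
this as a weighted average over terms $\|\Phi(|\phi_i\rangle\langle\psi_i|)\|_{\mathrm{tr}}$. Since at least one of these terms
must be at least as large as the average, there exists an $i$ for which

\[ \|\Phi\|_{\mathrm{tr}} \le \left\| \Phi(|\phi_i\rangle\langle\psi_i|) \right\|_{\mathrm{tr}}, \]

which implies that the trace norm is achieved on the operator $|\phi_i\rangle\langle\psi_i|$. □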

Unfortunately, this extension of the trace norm to channels does not have all of
the properties that we might like it to have. The most important of these is stability,
that is, the norm of an operator should not depend on the existence of a reference
system, or more concretely, the norm of the map $\Phi$ should not be smaller than the norm
of the map $\Phi \otimes I$. An example due to Watrous [Wat05] provides two channels $\Phi, \Psi$
on d-dimensional states such that $\|\Phi \otimes I - \Psi \otimes I\|_{\mathrm{tr}} = 2$ for an appropriate reference
system, but $\|\Phi - \Psi\|_{\mathrm{tr}} \in O(1/d)$. Phrased in terms of distinguishability, these are two
channels that are perfectly distinguishable with a reference system but almost identical
without it.
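The gap between the two notions of distance can be seen on a concrete pair of channels.
The text does not spell out the channels used in [Wat05]; as an assumption, the sketch
below uses the pair $X \mapsto (\mathrm{tr}(X)\mathbb{1} \pm X^{T})/(d \pm 1)$, which exhibits the same
behaviour. It only evaluates the difference on one pure input state and on the
maximally entangled state, not the full maximization in Equation (3.22).

```python
import numpy as np

def trace_norm(X):
    return float(np.sum(np.linalg.svd(X, compute_uv=False)))

def channel(X, d, sign):
    """Assumed example pair: X -> (tr(X) 1 + sign * X^T) / (d + sign)."""
    return (np.trace(X) * np.eye(d) + sign * X.T) / (d + sign)

def channel_with_reference(X, d, sign):
    """The same map tensored with the identity on a d-dimensional reference."""
    T = X.reshape(d, d, d, d)                  # indices (i, k, j, l)
    partial_transpose = T.transpose(2, 1, 0, 3).reshape(d * d, d * d)
    reduced = np.einsum('ikil->kl', T)         # trace over the first system
    return (np.kron(np.eye(d), reduced) + sign * partial_transpose) / (d + sign)

d = 8
ket = np.zeros(d)
ket[0] = 1.0
pure = np.outer(ket, ket)                      # input |0><0| without a reference
omega = np.eye(d).reshape(d * d) / np.sqrt(d)  # maximally entangled state
entangled = np.outer(omega, omega)

without_reference = trace_norm(channel(pure, d, +1) - channel(pure, d, -1))
with_reference = trace_norm(channel_with_reference(entangled, d, +1)
                            - channel_with_reference(entangled, d, -1))
```

With the reference system the two outputs have orthogonal supports, so the distance is 2;
on the pure input without a reference the distance is $4/(d+1)$, which shrinks as the
dimension grows.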



For this reason, we make use of the diamond norm, introduced by Kitaev [Kit97].
This norm includes the reference system in its definition, which stabilizes the
super-operator trace norm.

Definition 3.13. For a linear map $\Phi$ taking $L(\mathcal{H})$ to $L(\mathcal{K})$, the diamond norm of $\Phi$ is

\[ \|\Phi\|_\diamond = \|\Phi \otimes I_\mathcal{H}\|_{\mathrm{tr}} = \max_{X \in L(\mathcal{H} \otimes \mathcal{H}),\, X \ne 0} \frac{\|(\Phi \otimes I_\mathcal{H})(X)\|_{\mathrm{tr}}}{\|X\|_{\mathrm{tr}}}. \]

This norm is closely related to the completely bounded norm studied in operator
algebra. If $\Phi$ maps $L(\mathcal{H})$ to $L(\mathcal{K})$ and $\Phi^*$ is the adjoint map defined by $\mathrm{tr}(A^* \Phi(B)) =
\mathrm{tr}((\Phi^*(A))^* B)$, then $\|\Phi\|_\diamond = \|\Phi^*\|_{\mathrm{cb}}$. More information on the completely bounded
norm can be found in [Pau02].

As in the case of the super-operator trace norm, this norm inherits many properties
from the trace norm, such as the triangle inequality and invariance under unitary
operations. From a computational perspective, the optimization in the definition can
be cast as a semidefinite program [Wat09c] or as a more general convex optimization
problem [BATS09]. The paper [JKP09] also gives a heuristic for evaluating this norm.

It is not too difficult to see that this norm is stable. This was first noted by Kitaev
[Kit97] for the diamond norm, and by Smith [Smi83] for the equivalent case of
the completely bounded norm. The simpler proof used here can be found in [Wat05],
though a similar argument appears in [GLN05], for the case that the maximization in
the definition of the diamond norm is restricted to the density operators.



Theorem 3.14 (Kitaev IIKit97ll . Smith llSmiSSII ). Let (Dbea linear map from L(:K) to \.[%]. 
For any space 3^ 

||(D||^ = ||0 ® IjcIIj^ ^ ||cD ® I^llt^. 

Proof. In the case that $\dim \mathcal{F} < \dim \mathcal{H}$ the statement of the theorem is clear: the maximization in the definition of the super-operator trace norm is being taken over a smaller space.

In the case that $\dim \mathcal{F} \ge \dim \mathcal{H}$, Proposition 3.12 implies that there exist vectors $|\phi\rangle$ and $|\psi\rangle$ such that

\[ \|\Phi \otimes I_{\mathcal{F}}\|_{\mathrm{tr}} = \|(\Phi \otimes I_{\mathcal{F}})(|\phi\rangle\langle\psi|)\|_{\mathrm{tr}}. \]

If we take Schmidt decompositions of these vectors, they can have at most $d = \min\{\dim \mathcal{F}, \dim \mathcal{H}\} = \dim \mathcal{H}$ terms. Doing so, we have

\[ |\phi\rangle = \sum_{i=1}^{d} \lambda_i |a_i\rangle |x_i\rangle, \qquad |\psi\rangle = \sum_{i=1}^{d} \mu_i |b_i\rangle |y_i\rangle, \tag{3.23} \]

where $\{|a_i\rangle\}, \{|b_i\rangle\}$ are orthonormal bases for $\mathcal{H}$, and $\{|x_i\rangle\}, \{|y_i\rangle\}$ are orthonormal bases for $d$-dimensional subspaces of $\mathcal{F}$. The remainder of the proof is a straightforward but technical argument, based on the Schmidt decomposition, that we can embed these subspaces into a space of dimension $d$ with no loss in the value of the norm. This is simply a formalization of the observation that since the states $|\phi\rangle$ and $|\psi\rangle$ live in a $d$-dimensional subspace of $\mathcal{F}$, we do not need the auxiliary space in the diamond norm to have more than this dimension.
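To formalize this, let $U, V \in U(\mathcal{H}, \mathcal{F})$ be the isometries that take the standard basis $\{|i\rangle\}$ of $\mathcal{H}$ to the bases $\{|x_i\rangle\}$ and $\{|y_i\rangle\}$, respectively. The maps $UU^*$ and $VV^*$ are then the projections onto the spaces spanned by $\{|x_i\rangle\}$ and $\{|y_i\rangle\}$, which do not affect $|\phi\rangle$ and $|\psi\rangle$ by Equation (3.23). Using this notation

\begin{align*}
\|\Phi \otimes I_{\mathcal{H}}\|_{\mathrm{tr}}
  &\ge \big\| (\Phi \otimes I_{\mathcal{H}})\big[(\mathbb{1}_{\mathcal{H}} \otimes U^*) |\phi\rangle\langle\psi| (\mathbb{1}_{\mathcal{H}} \otimes V)\big] \big\|_{\mathrm{tr}} \\
  &= \big\| (\mathbb{1}_{\mathcal{K}} \otimes U)\, (\Phi \otimes I_{\mathcal{H}})\big[(\mathbb{1}_{\mathcal{H}} \otimes U^*) |\phi\rangle\langle\psi| (\mathbb{1}_{\mathcal{H}} \otimes V)\big]\, (\mathbb{1}_{\mathcal{K}} \otimes V^*) \big\|_{\mathrm{tr}} \\
  &= \big\| (\Phi \otimes I_{\mathcal{F}})\big[(\mathbb{1}_{\mathcal{H}} \otimes UU^*) |\phi\rangle\langle\psi| (\mathbb{1}_{\mathcal{H}} \otimes VV^*)\big] \big\|_{\mathrm{tr}} \\
  &= \big\| (\Phi \otimes I_{\mathcal{F}})(|\phi\rangle\langle\psi|) \big\|_{\mathrm{tr}} \\
  &= \|\Phi \otimes I_{\mathcal{F}}\|_{\mathrm{tr}},
\end{align*}

as desired. □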



In addition to stability, this norm has several other convenient properties. One
of these is multiplicativity with respect to the tensor product. This is not true of the
maximum output p-norm for any p > 1 [HW08], but in the case of the diamond norm,
it is a direct consequence of the submultiplicativity of the trace norm.

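Theorem 3.15. Let $\Phi \in T(\mathcal{H},\mathcal{K})$ and $\Psi \in T(\mathcal{F},\mathcal{G})$. Then

\[ \|\Phi \otimes \Psi\|_\diamond = \|\Phi\|_\diamond \|\Psi\|_\diamond. \]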



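Proof. One direction follows immediately from the multiplicativity of the trace norm with respect to the tensor product (Equation (3.10)), since, with the auxiliary spaces written explicitly and the tensor factors reordered where necessary,

\begin{align*}
\|\Phi \otimes \Psi\|_\diamond
  &= \max_{Z \in L(\mathcal{H} \otimes \mathcal{F} \otimes \mathcal{H} \otimes \mathcal{F})} \frac{\|(\Phi \otimes \Psi \otimes I_{\mathcal{H} \otimes \mathcal{F}})(Z)\|_{\mathrm{tr}}}{\|Z\|_{\mathrm{tr}}} \\
  &\ge \max_{X \in L(\mathcal{H} \otimes \mathcal{H}),\, Y \in L(\mathcal{F} \otimes \mathcal{F})} \frac{\|(\Phi \otimes I_{\mathcal{H}})(X) \otimes (\Psi \otimes I_{\mathcal{F}})(Y)\|_{\mathrm{tr}}}{\|X \otimes Y\|_{\mathrm{tr}}} \\
  &= \|\Phi\|_\diamond \|\Psi\|_\diamond.
\end{align*}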



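The other direction is a consequence of Theorem 3.14 and Lemma 3.11:

\begin{align*}
\|\Phi \otimes \Psi\|_\diamond
  &= \|\Phi \otimes \Psi \otimes I_{\mathcal{H} \otimes \mathcal{F}}\|_{\mathrm{tr}} \\
  &= \|(\Phi \otimes I_{\mathcal{G} \otimes \mathcal{H} \otimes \mathcal{F}}) \circ (\Psi \otimes I_{\mathcal{H} \otimes \mathcal{H} \otimes \mathcal{F}})\|_{\mathrm{tr}} \\
  &\le \|\Phi \otimes I_{\mathcal{G} \otimes \mathcal{H} \otimes \mathcal{F}}\|_{\mathrm{tr}} \, \|\Psi \otimes I_{\mathcal{H} \otimes \mathcal{H} \otimes \mathcal{F}}\|_{\mathrm{tr}} \\
  &\le \|\Phi \otimes I_{\mathcal{H}}\|_{\mathrm{tr}} \, \|\Psi \otimes I_{\mathcal{F}}\|_{\mathrm{tr}} \\
  &= \|\Phi\|_\diamond \|\Psi\|_\diamond,
\end{align*}

which completes the proof. □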






As promised, the diamond norm of $\Phi - \Psi$ determines the probability that an unknown channel in $\{\Phi, \Psi\}$ can be correctly identified with only a single use of the channel. This gives an important operational characterization of the diamond norm that has many useful applications to quantum error correction and other fields. The proof follows directly from the definition of the diamond norm and Helstrom's result on the minimum error distinguishability for two states (Theorem 3.9).



Corollary 3.16. The optimal probability that an unknown channel $\Phi \in T(\mathcal{H},\mathcal{K})$ chosen uniformly at random from $\{\Phi_1, \Phi_2\}$ can be correctly identified given a single use is given by

\[ \frac{1}{2} + \frac{1}{4} \|\Phi_1 - \Phi_2\|_\diamond. \]



Proof. By Theorem 3.17, which follows (and is proven independently of this result), the maximization in the definition of the diamond norm may be taken over a pure state, so that for some space $\mathcal{F}$ there exists a pure state $|\psi\rangle \in \mathcal{H} \otimes \mathcal{F}$ such that the value in the statement of the corollary is equal to

\[ \frac{1}{4} \Big( 2 + \big\| (\Phi_1 \otimes I_{\mathcal{F}})(|\psi\rangle\langle\psi|) - (\Phi_2 \otimes I_{\mathcal{F}})(|\psi\rangle\langle\psi|) \big\|_{\mathrm{tr}} \Big). \tag{3.24} \]

This expression is simply the optimal probability of identifying the state $(\Phi \otimes I_{\mathcal{F}})(|\psi\rangle\langle\psi|)$ from the set

\[ \{ (\Phi_1 \otimes I_{\mathcal{F}})(|\psi\rangle\langle\psi|),\ (\Phi_2 \otimes I_{\mathcal{F}})(|\psi\rangle\langle\psi|) \}, \]

by Theorem 3.9.

Given only one use of $\Phi$, there is no other strategy than applying it to (a portion of) some optimal distinguishing state and then measuring the result. Theorem 3.17 implies that there is such a state, and so the optimal probability is given by Equation (3.24), as required. □
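As a small numerical illustration of Corollary 3.16, the following sketch evaluates the success probability $1/2 + \|\Phi_1 - \Phi_2\|_\diamond / 4$ for a given value of the diamond norm. The function name and the range check are illustrative choices, not part of the original text; the input is assumed to be a precomputed value of the norm, which lies between 0 and 2 for the difference of two channels.

```python
def single_use_success_probability(diamond_distance: float) -> float:
    """Optimal probability of identifying one of two equally likely channels
    from a single use, given the diamond norm of their difference.

    `diamond_distance` is assumed to be supplied by the caller; this sketch
    does not compute the norm itself.
    """
    if not 0.0 <= diamond_distance <= 2.0:
        raise ValueError("the diamond norm of a difference of channels lies in [0, 2]")
    return 0.5 + diamond_distance / 4.0


if __name__ == "__main__":
    # Identical channels give a coin flip; perfectly distinguishable ones give certainty.
    for d in (0.0, 0.5, 1.0, 2.0):
        print(d, single_use_success_probability(d))
```

A norm of 0 corresponds to random guessing and a norm of 2 to perfect identification, which matches the two extremes of the corollary.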

With this corollary in mind, we will define the computational problem of distinguishing two channels in terms of the diamond norm of their difference. This is the main problem studied in this thesis, and so the properties of the diamond norm that we have so far established will be useful throughout.



3.5.1 Maximization on a pure state for the difference of channels 

In this section it is shown that when applied to the difference of two channels the diamond norm is achieved on a pure state. The result is technical, but it has many applications in the remainder of the thesis. This is the product of joint work with John Watrous [RW05].

The theorem applies to the maps that are the difference of two completely positive maps. This property implies another simple property: there exist completely positive $\Psi, \Gamma$ such that $\Phi = \Psi - \Gamma$ if and only if $\Phi(X^*) = \Phi(X)^*$ for all $X$. We will only need one direction of this equivalence. If $\Phi = \Psi - \Gamma$, taking Kraus operator decompositions of the completely positive maps $\Psi$ and $\Gamma$ implies that

\[ \Phi(X^*) = \Psi(X^*) - \Gamma(X^*) = \sum_i \big( A_i X^* A_i^* - B_i X^* B_i^* \big) = \Big( \sum_i \big( A_i X A_i^* - B_i X B_i^* \big) \Big)^* = \Phi(X)^*. \]

This implication will be used in the proof of the main theorem of the section. In the construction used for the proof of this theorem, the space $\mathcal{F}$ has dimension $2 \dim \mathcal{H}$. As explained following the theorem, this can be achieved with a space of dimension $\dim \mathcal{H}$, though the argument is not included in the proof of the theorem for simplicity.



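Theorem 3.17. Let $\Phi \colon L(\mathcal{H}) \to L(\mathcal{K})$ be the difference of two completely positive maps. There exists a Hilbert space $\mathcal{F}$ and a unit vector $|\psi\rangle \in \mathcal{H} \otimes \mathcal{F}$ such that

\[ \|\Phi\|_\diamond = \|(\Phi \otimes I_{\mathcal{F}})(|\psi\rangle\langle\psi|)\|_{\mathrm{tr}}. \]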

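Proof. By the definition of the diamond norm

\[ \|\Phi\|_\diamond = \|\Phi \otimes I_{\mathcal{H}}\|_{\mathrm{tr}} = \max\{ \|(\Phi \otimes I_{\mathcal{H}})(X)\|_{\mathrm{tr}} : \|X\|_{\mathrm{tr}} = 1 \}. \]

Let $X \in L(\mathcal{H} \otimes \mathcal{H})$ be an operator that achieves this maximum and let $\mathcal{C}$ be a Hilbert space of dimension two (i.e. a single qubit). Consider the Hermitian operator $Y \in L(\mathcal{H} \otimes \mathcal{H} \otimes \mathcal{C})$ given by

\[ Y = \tfrac{1}{2} X \otimes |0\rangle\langle 1| + \tfrac{1}{2} X^* \otimes |1\rangle\langle 0|. \]

Notice also that $\|Y\|_{\mathrm{tr}} = \|X\|_{\mathrm{tr}} = 1$ by Lemma 3.10.

As observed above, the condition that $\Phi$ is the difference of two completely positive transformations implies that $\Phi(X^*) = \Phi(X)^*$ for all $X$. Using this, as well as Lemma 3.10,

\begin{align*}
\|(\Phi \otimes I_{\mathcal{H} \otimes \mathcal{C}})(Y)\|_{\mathrm{tr}}
  &= \tfrac{1}{2} \big\| (\Phi \otimes I_{\mathcal{H}})(X) \otimes |0\rangle\langle 1| + (\Phi \otimes I_{\mathcal{H}})(X^*) \otimes |1\rangle\langle 0| \big\|_{\mathrm{tr}} \\
  &= \tfrac{1}{2} \big\| (\Phi \otimes I_{\mathcal{H}})(X) \otimes |0\rangle\langle 1| + (\Phi \otimes I_{\mathcal{H}})(X)^* \otimes |1\rangle\langle 0| \big\|_{\mathrm{tr}} \\
  &= \|(\Phi \otimes I_{\mathcal{H}})(X)\|_{\mathrm{tr}} \\
  &= \|\Phi\|_\diamond.
\end{align*}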
This implies that the maximum is achieved on a Hermitian matrix $Y$.

It is not hard to see that this implies that the maximum is achieved on a pure state. To do so, note that since $Y$ is Hermitian, it has a spectral decomposition. Let such a decomposition be given by

\[ Y = \sum_i \lambda_i |\psi_i\rangle\langle\psi_i|, \]

where $\{|\psi_i\rangle\}$ is an orthonormal basis of eigenvectors with real eigenvalues $\{\lambda_i\}$. In addition, because $\|Y\|_{\mathrm{tr}} = 1$, it is the case that $\sum_i |\lambda_i| = 1$. By the linearity of $\Phi$, as well as the triangle inequality and the homogeneity of the trace norm,

\[ \|(\Phi \otimes I_{\mathcal{H} \otimes \mathcal{C}})(Y)\|_{\mathrm{tr}} \le \sum_i |\lambda_i| \, \|(\Phi \otimes I_{\mathcal{H} \otimes \mathcal{C}})(|\psi_i\rangle\langle\psi_i|)\|_{\mathrm{tr}}. \]

Because $\sum_i |\lambda_i| = 1$, it follows that at least one term in the average achieves the bound, i.e. that

\[ \|(\Phi \otimes I_{\mathcal{H} \otimes \mathcal{C}})(|\psi_i\rangle\langle\psi_i|)\|_{\mathrm{tr}} \ge \|\Phi\|_\diamond \]

for some value of $i$. For this value of $i$ we have, by Theorem 3.14,

\[ \|(\Phi \otimes I_{\mathcal{H} \otimes \mathcal{C}})(|\psi_i\rangle\langle\psi_i|)\|_{\mathrm{tr}} \le \|\Phi\|_\diamond, \]

which implies that $\|(\Phi \otimes I_{\mathcal{H} \otimes \mathcal{C}})(|\psi_i\rangle\langle\psi_i|)\|_{\mathrm{tr}} = \|\Phi\|_\diamond$, as required. □

This theorem does not hold for the trace norm on super-operators, by an example due to Watrous [Wat05]. It may seem odd that in the proof of this theorem the space $\mathcal{F} = \mathcal{H} \otimes \mathcal{C}$ has larger dimension than is required to achieve the maximum, since $\dim \mathcal{H} \otimes \mathcal{C} = 2 \dim \mathcal{H}$. By examining the proof of Theorem 3.14, however, it can be seen that this need not be the case. Applying that theorem in the case that the maximum is achieved on a Hermitian operator produces a Hermitian operator in $L(\mathcal{H} \otimes \mathcal{H})$ that achieves the maximum. Applying the convexity argument made at the end of the proof of Theorem 3.17 yields a pure state in $\mathcal{H} \otimes \mathcal{H}$ on which the maximum is achieved.

One convenient consequence of this theorem is that the diamond norm of any quantum channel is equal to one. This is implied by the definition, since for any $\Phi \in T(\mathcal{H},\mathcal{K})$ the theorem implies that there is a state $|\psi\rangle$ such that

\[ \|\Phi\|_\diamond = \|(\Phi \otimes I_{\mathcal{F}})(|\psi\rangle\langle\psi|)\|_{\mathrm{tr}} = \|\rho\|_{\mathrm{tr}} = 1, \tag{3.25} \]

where $\rho$ is the density matrix that results from applying $\Phi \otimes I_{\mathcal{F}}$, which has trace norm one because it is normalized. An alternate proof of this fact, not using the above result, can be found in [AKN98].
There is one further result on the diamond norm, demonstrated in Section 3.7. This result is a procedure for polarizing this norm, in the sense that if the diamond norm of the difference of the input channels is small, the result is two channels whose difference has extremely small norm. Similarly, if the difference of the input channels has large norm, then the resulting channels are almost perfectly distinguishable. Results of this type can have powerful applications for error reduction. The discussion of this result is postponed until Section 3.7 so that the fidelity can be introduced, as it is used in the proof of the polarization result. The fidelity is the topic of the next section.



3.6 Fidelity 

One of the most important tools in quantum information is the fidelity, which provides a way to determine how close two states are to each other. For pure states $|\psi\rangle$ and $|\phi\rangle$ the fidelity has a simple expression

\[ F(|\psi\rangle, |\phi\rangle) = |\langle\psi|\phi\rangle|. \]

This can be generalized to the case of mixed quantum states $\rho, \sigma \in D(\mathcal{H})$ by

\[ F(\rho, \sigma) = \operatorname{tr} \sqrt{\sqrt{\rho}\, \sigma \sqrt{\rho}}. \tag{3.26} \]

This quantity and its generalization to mixed states are due to Uhlmann [Uhl76]. The fidelity ranges between $F(\rho, \sigma) = 1$ when $\rho = \sigma$ and $F(\rho, \sigma) = 0$ when $\rho$ and $\sigma$ have orthogonal support. The remainder of this section is a survey of some of the most important properties of the fidelity. A more complete introduction to this quantity can be found in [NC00].
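The two expressions above can be evaluated directly in simple cases. The sketch below covers pure states, given as lists of amplitudes, and diagonal (commuting) mixed states, for which Equation (3.26) reduces to $\sum_i \sqrt{p_i q_i}$ because $\sqrt{\rho}\,\sigma\sqrt{\rho}$ is then diagonal with entries $p_i q_i$. The function names are illustrative, and general non-commuting mixed states are deliberately not handled.

```python
import math


def pure_state_fidelity(psi, phi):
    """F(|psi>, |phi>) = |<psi|phi>| for unit vectors given as amplitude lists."""
    return abs(sum(a.conjugate() * b for a, b in zip(psi, phi)))


def diagonal_state_fidelity(p, q):
    """Equation (3.26) for diagonal states with eigenvalue lists p and q.

    For commuting states the matrix square roots act entrywise, so the
    fidelity is the sum of sqrt(p_i * q_i).
    """
    return sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))


if __name__ == "__main__":
    s = 1 / math.sqrt(2)
    print(pure_state_fidelity([1, 0], [s, s]))            # overlap of |0> and |+>
    print(diagonal_state_fidelity([0.5, 0.5], [0.5, 0.5]))  # identical states
    print(diagonal_state_fidelity([1.0, 0.0], [0.0, 1.0]))  # orthogonal support
```

The last two calls reproduce the extreme values described above: fidelity one for identical states and zero for states with orthogonal support.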



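One property that is convenient to show from Equation (3.26) is multiplicativity with respect to the tensor product. Following [Joz94], this is an easy consequence of the fact that $\sqrt{\rho \otimes \sigma} = \sqrt{\rho} \otimes \sqrt{\sigma}$. This implies that

\begin{align*}
F(\rho_1 \otimes \rho_2, \sigma_1 \otimes \sigma_2)
  &= \operatorname{tr} \sqrt{ \sqrt{\rho_1 \otimes \rho_2}\, (\sigma_1 \otimes \sigma_2) \sqrt{\rho_1 \otimes \rho_2} } \\
  &= \Big( \operatorname{tr} \sqrt{\sqrt{\rho_1}\, \sigma_1 \sqrt{\rho_1}} \Big) \Big( \operatorname{tr} \sqrt{\sqrt{\rho_2}\, \sigma_2 \sqrt{\rho_2}} \Big) \\
  &= F(\rho_1, \sigma_1) F(\rho_2, \sigma_2). \tag{3.27}
\end{align*}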



This is one of the few properties that is easy to prove from Equation (3.26). For example, it is not clear from this equation that $F(\rho, \sigma) = F(\sigma, \rho)$. This property follows directly from a characterization of the fidelity known as Uhlmann's theorem. This characterization is extremely useful, often being used as the definition of the fidelity, and is presented as the following theorem.



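Theorem 3.18 (Uhlmann's Theorem [Uhl76]). Let $\rho, \sigma \in D(\mathcal{H})$ and let $\mathcal{K}$ be any space large enough to admit purifications of $\rho$ and $\sigma$. Then

\[ F(\rho, \sigma) = \max\{ |\langle\phi|\psi\rangle| : |\phi\rangle, |\psi\rangle \in \mathcal{H} \otimes \mathcal{K},\ \operatorname{tr}_{\mathcal{K}} |\phi\rangle\langle\phi| = \rho,\ \operatorname{tr}_{\mathcal{K}} |\psi\rangle\langle\psi| = \sigma \}. \]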



Uhlmann's theorem restricted to the finite dimensional case (once again, the cited result is much more general than has been applied here) allows the derivation of several nice properties of the fidelity. These properties are summarized in the following proposition, many of which are observed by Jozsa [Joz94].

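Proposition 3.19. For $\rho, \sigma \in D(\mathcal{H})$ and $U \in U(\mathcal{H}, \mathcal{K})$ the fidelity satisfies

(a) $0 \le F(\rho, \sigma) \le 1$

(b) $F(\rho, \sigma) = F(\sigma, \rho)$

(c) $F(U \rho U^*, U \sigma U^*) = F(\rho, \sigma)$

(d) For $\rho, \sigma \in D(\mathcal{H} \otimes \mathcal{K})$, $F(\operatorname{tr}_{\mathcal{K}} \rho, \operatorname{tr}_{\mathcal{K}} \sigma) \ge F(\rho, \sigma)$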



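Proof. All four of these properties are simple corollaries of Theorem 3.18. Properties (a) and (b) follow immediately. Property (c) follows from the fact that if $|\psi\rangle, |\phi\rangle \in \mathcal{H} \otimes \mathcal{F}$ are purifications of $\rho$ and $\sigma$ achieving the maximum in the theorem, then $(U \otimes \mathbb{1}_{\mathcal{F}})|\psi\rangle$ and $(U \otimes \mathbb{1}_{\mathcal{F}})|\phi\rangle$ are purifications of $U \rho U^*$ and $U \sigma U^*$, respectively, and

\[ |\langle\psi| (U \otimes \mathbb{1}_{\mathcal{F}})^* (U \otimes \mathbb{1}_{\mathcal{F}}) |\phi\rangle| = |\langle\psi|\phi\rangle| = F(\rho, \sigma). \]

Property (d) is a consequence of the fact that if $|\psi\rangle \in \mathcal{H} \otimes \mathcal{K} \otimes \mathcal{F}$ is a purification of $\rho$, then it is also a purification of $\operatorname{tr}_{\mathcal{K}} \rho$. □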



One further property of the fidelity will be quite important: the monotonicity under the application of a quantum channel. We have seen two special cases of this, unitary operations and the partial trace, as part of the previous proposition. Extending these cases to the set of all channels is a simple consequence of the Stinespring representation for channels.

This proof is due to Jozsa [Joz94] (see also [NC00]), though it is not difficult to derive from Theorem 3.18.
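Theorem 3.20. Let $\Phi \in T(\mathcal{H},\mathcal{K})$, then for any $\rho, \sigma \in D(\mathcal{H})$

\[ F(\Phi(\rho), \Phi(\sigma)) \ge F(\rho, \sigma). \]

Proof. By Theorem 3.18 let $|\phi\rangle, |\psi\rangle \in \mathcal{H} \otimes \mathcal{F}$ be purifications of $\rho$ and $\sigma$ such that

\[ F(\rho, \sigma) = |\langle\phi|\psi\rangle|. \]

Additionally, let $\Phi$ have a Stinespring representation given by $\Phi(X) = \operatorname{tr}_{\mathcal{G}} U (X \otimes |0\rangle\langle 0|) U^*$. For brevity, let $\tilde{U} = U \otimes \mathbb{1}_{\mathcal{F}}$. Notice that $\tilde{U}|\phi\rangle|0\rangle$ purifies $\Phi(\rho)$ and that $\tilde{U}|\psi\rangle|0\rangle$ purifies $\Phi(\sigma)$. Using this notation along with Theorem 3.18,

\begin{align*}
F(\Phi(\rho), \Phi(\sigma))
  &\ge F\big( \tilde{U}(|\phi\rangle\langle\phi| \otimes |0\rangle\langle 0|)\tilde{U}^*,\ \tilde{U}(|\psi\rangle\langle\psi| \otimes |0\rangle\langle 0|)\tilde{U}^* \big) \\
  &= |\langle 0|\langle\phi| \tilde{U}^* \tilde{U} |\psi\rangle|0\rangle| = |\langle\phi|\psi\rangle| = F(\rho, \sigma),
\end{align*}

as required. □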

The special case of the monotonicity of the fidelity for the partial trace (item (d) in Proposition 3.19) will be particularly important. This case implies that the fidelity can only increase when it is taken over only part of a system, where the remainder of the system has been traced out. This property will be essential to the results in Chapter 4.



3.6.1 Relation to the trace norm 

The fidelity and the trace norm are two of the most useful quantities for determining 
how close two quantum states are to each other. Despite the similarities between them, 
it is often much more convenient to work with one or the other of these quantities, 
and so relationships between them are very useful. This section presents two such 
relationships that we will make use of later in the thesis. 



Uhlmann's Theorem (3.18) can also be used to characterize the fidelity using the
trace norm. This result is not hard to prove, but will be central to a couple of proofs
that appear later.

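Lemma 3.21. Let $\rho, \xi \in D(\mathcal{H})$. Then for arbitrary purifications $|\phi\rangle, |\psi\rangle \in \mathcal{H} \otimes \mathcal{A}$ of $\rho$ and $\xi$, respectively, we have $\|\operatorname{tr}_{\mathcal{H}} |\psi\rangle\langle\phi|\|_{\mathrm{tr}} = F(\rho, \xi)$.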
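Proof. Using Lemma 3.7 together with Theorem 3.18 and the fact that all the purifications of $\rho$ and $\xi$ are unitarily equivalent using a unitary on the space $\mathcal{A}$, we have

\begin{align*}
\|\operatorname{tr}_{\mathcal{H}} |\psi\rangle\langle\phi|\|_{\mathrm{tr}}
  &= \max_{U \in U(\mathcal{A})} \big| \operatorname{tr}\big( (\operatorname{tr}_{\mathcal{H}} |\psi\rangle\langle\phi|)\, U \big) \big| \\
  &= \max_{U \in U(\mathcal{A})} \big| \operatorname{tr}\big( |\psi\rangle\langle\phi| (\mathbb{1}_{\mathcal{H}} \otimes U) \big) \big| \\
  &= \max_{U \in U(\mathcal{A})} \big| \langle\phi| (\mathbb{1}_{\mathcal{H}} \otimes U) |\psi\rangle \big| \\
  &= F(\rho, \xi),
\end{align*}

as claimed. □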

A very useful relationship between the fidelity and the trace norm is given by the Fuchs-van de Graaf Inequalities. These inequalities show that, up to polynomial factors, the fidelity and the trace norm are equivalent. This is helpful, since it is often much easier to work with one or the other of these quantities.

Theorem 3.22 (Fuchs and van de Graaf [FvdG99]). For any $\rho, \sigma \in D(\mathcal{H})$

\[ 1 - F(\rho, \sigma) \le \frac{1}{2} \|\rho - \sigma\|_{\mathrm{tr}} \le \sqrt{1 - F(\rho, \sigma)^2}. \]

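The second inequality is not hard to prove. Following [NC00], let $|\phi\rangle, |\psi\rangle \in \mathcal{H} \otimes \mathcal{A}$ be purifications of $\rho$ and $\sigma$ achieving the bound $F(\rho, \sigma) = |\langle\phi|\psi\rangle|$ in Uhlmann's Theorem (3.18). Using the monotonicity of the trace norm under the partial trace (shown earlier in this chapter), we have

\begin{align*}
\|\rho - \sigma\|_{\mathrm{tr}}
  &= \big\| \operatorname{tr}_{\mathcal{A}} ( |\phi\rangle\langle\phi| - |\psi\rangle\langle\psi| ) \big\|_{\mathrm{tr}} \\
  &\le \big\| |\phi\rangle\langle\phi| - |\psi\rangle\langle\psi| \big\|_{\mathrm{tr}} \\
  &= 2 \sqrt{1 - |\langle\phi|\psi\rangle|^2} \\
  &= 2 \sqrt{1 - F(\rho, \sigma)^2}.
\end{align*}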

The first inequality is more difficult: it requires either characterizing the fidelity and the trace norm in terms of classical variants on the outcomes of measurements, as is done in [FvdG99, NC00], or proving a technical result on the trace norm of the difference of two positive operators, as is done in [KSV02]. Neither of these techniques is used in the remainder of the thesis, and so the proof of this inequality is omitted.
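The inequalities of Theorem 3.22 can be checked numerically in the special case of diagonal states, which are just probability distributions: the fidelity becomes $\sum_i \sqrt{p_i q_i}$ and the trace norm of the difference becomes the $\ell_1$ distance. The following sketch tests both inequalities on randomly sampled distributions. The helper names, the sampling scheme, and the floating-point tolerance are illustrative assumptions, and a numerical check of this kind is evidence for the special case only, not a proof.

```python
import math
import random


def fidelity(p, q):
    """Fidelity of two diagonal states with eigenvalue lists p and q."""
    return sum(math.sqrt(a * b) for a, b in zip(p, q))


def trace_norm_of_difference(p, q):
    """Trace norm of diag(p) - diag(q), i.e. the l1 distance of p and q."""
    return sum(abs(a - b) for a, b in zip(p, q))


def random_distribution(dim, rng):
    """Sample a probability vector by normalizing uniform random weights."""
    weights = [rng.random() for _ in range(dim)]
    total = sum(weights)
    return [w / total for w in weights]


def satisfies_fuchs_van_de_graaf(p, q, tol=1e-9):
    """Check 1 - F <= (1/2)||p - q|| <= sqrt(1 - F^2) up to a small tolerance."""
    f = fidelity(p, q)
    half_norm = 0.5 * trace_norm_of_difference(p, q)
    upper = math.sqrt(max(0.0, 1.0 - f * f))
    return 1.0 - f <= half_norm + tol and half_norm <= upper + tol


if __name__ == "__main__":
    rng = random.Random(0)
    results = [
        satisfies_fuchs_van_de_graaf(random_distribution(4, rng), random_distribution(4, rng))
        for _ in range(1000)
    ]
    print(all(results))
```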

The Fuchs-van de Graaf Inequalities may be equivalently rephrased in terms of upper and lower bounds on the fidelity in terms of the trace norm. These bounds are

\[ 1 - \frac{1}{2} \|\rho - \sigma\|_{\mathrm{tr}} \le F(\rho, \sigma) \le \sqrt{1 - \frac{1}{4} \|\rho - \sigma\|_{\mathrm{tr}}^2} \tag{3.28} \]

and can be derived by simple manipulations of Theorem 3.22.
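For completeness, the manipulations can be spelled out as follows; this is a routine rearrangement of Theorem 3.22, written with $F = F(\rho, \sigma)$ and $t = \|\rho - \sigma\|_{\mathrm{tr}}$ as shorthand introduced only for this derivation.

```latex
\begin{align*}
1 - F \le \tfrac{1}{2} t
  &\iff F \ge 1 - \tfrac{1}{2} t, \\
\tfrac{1}{2} t \le \sqrt{1 - F^2}
  &\iff \tfrac{1}{4} t^2 \le 1 - F^2
     && \text{(both sides are nonnegative)} \\
  &\iff F^2 \le 1 - \tfrac{1}{4} t^2 \\
  &\iff F \le \sqrt{1 - \tfrac{1}{4} t^2}
     && \text{(since } F \ge 0\text{)}.
\end{align*}
```

The first line gives the lower bound in (3.28) and the remaining lines give the upper bound.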



3.6.2 Maximum output fidelity for channels 

The fidelity can be extended to quantum channels in much the same way as the previous quantities that we have considered. The maximum output fidelity of two channels $\Phi_1, \Phi_2 \in T(\mathcal{H},\mathcal{K})$ can be defined as

\[ F_{\max}(\Phi_1, \Phi_2) = \max_{\rho, \sigma \in D(\mathcal{H})} F(\Phi_1(\rho), \Phi_2(\sigma)). \]

This quantity is essential to the Close Images problem that is the focus of Chapter 4. The property of primary importance for this application is the multiplicativity of $F_{\max}$ with respect to the tensor product of two channels. This will be essential for error reduction on instances of Close Images. This result is used implicitly by Kitaev and Watrous [KW00], and the main thrust of it can also be found in [KSV02] (see Problem 11.10). Due to its importance, a complete proof is presented here. The method of proof used here is due to John Watrous¹, though it is similar to the techniques used in [KW00, KSV02]. This proof makes use of the diamond norm, and specifically the multiplicativity of the diamond norm with respect to tensor products, which was introduced in Section 3.5.

The first part of the proof is a relationship between the maximum output fidelity of 
two channels and the diamond norm of a certain completely positive super-operator. 

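Lemma 3.23 (Kitaev and Watrous [KW00]). Let $\Phi, \Psi \in T(\mathcal{H},\mathcal{K})$, and let the linear map $\Gamma \colon L(\mathcal{H}) \to L(\mathcal{G})$ be given by

\begin{align*}
\Phi(X) &= \operatorname{tr}_{\mathcal{G}} U X U^*, \\
\Psi(X) &= \operatorname{tr}_{\mathcal{G}} V X V^*, \\
\Gamma(X) &= \operatorname{tr}_{\mathcal{K}} U X V^*,
\end{align*}

where $U, V \in U(\mathcal{H}, \mathcal{G} \otimes \mathcal{K})$. Using this notation,

\[ F_{\max}(\Phi, \Psi) = \|\Gamma\|_\diamond. \]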

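Proof. Let $\mathcal{A}$ be a space with $\dim \mathcal{A} = \dim \mathcal{H}$ to allow purifications of states in $D(\mathcal{H})$, and let $\tilde{U} = U \otimes \mathbb{1}_{\mathcal{A}}$ and $\tilde{V} = V \otimes \mathbb{1}_{\mathcal{A}}$, then

\begin{align*}
F_{\max}(\Phi, \Psi)
  &= \max_{\rho, \sigma \in D(\mathcal{H})} F(\Phi(\rho), \Psi(\sigma)) \\
  &= \max_{|\phi\rangle, |\psi\rangle} F\big( \operatorname{tr}_{\mathcal{G} \otimes \mathcal{A}} \tilde{U} |\phi\rangle\langle\phi| \tilde{U}^*,\ \operatorname{tr}_{\mathcal{G} \otimes \mathcal{A}} \tilde{V} |\psi\rangle\langle\psi| \tilde{V}^* \big),
\end{align*}

where $|\phi\rangle$ and $|\psi\rangle$ are purifications of $\rho$ and $\sigma$. Applying Lemma 3.21 to this, since $\tilde{U}|\phi\rangle$ purifies $\Phi(\rho)$ and $\tilde{V}|\psi\rangle$ purifies $\Psi(\sigma)$, results in

\[ \max_{|\phi\rangle, |\psi\rangle} \big\| \operatorname{tr}_{\mathcal{K}} \tilde{U} |\phi\rangle\langle\psi| \tilde{V}^* \big\|_{\mathrm{tr}} = \max_{|\phi\rangle, |\psi\rangle} \big\| (\Gamma \otimes I_{\mathcal{A}})(|\phi\rangle\langle\psi|) \big\|_{\mathrm{tr}} = \|\Gamma\|_\diamond, \]

where the last equality is an application of Proposition 3.12. □

¹ John Watrous, private communication.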



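Using this lemma the desired result on the multiplicativity of $F_{\max}$ follows immediately from the multiplicativity of the diamond norm with respect to the tensor product.

Theorem 3.24 (Kitaev and Watrous [KW00]). For any $\Phi_1, \Psi_1 \in T(\mathcal{H},\mathcal{K})$ and any $\Phi_2, \Psi_2 \in T(\mathcal{F},\mathcal{G})$

\[ F_{\max}(\Phi_1 \otimes \Phi_2, \Psi_1 \otimes \Psi_2) = \prod_{i=1,2} F_{\max}(\Phi_i, \Psi_i). \]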

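Proof. Let $\Phi_i(X) = \operatorname{tr}_{\mathrm{anc}} U_i X U_i^*$ and $\Psi_i(X) = \operatorname{tr}_{\mathrm{anc}} V_i X V_i^*$ be Stinespring representations of the channels $\Phi_i, \Psi_i$ for $i = 1, 2$, where each partial trace is over the ancillary space of the corresponding representation and, for notational convenience, the introduction of ancillary qubits has been merged into the isometries $U_i$ and $V_i$. Then, setting $\Gamma_i(X) = \operatorname{tr}_{\mathrm{out}} U_i X V_i^*$, with the partial trace over the output space of the channels, we are in exactly the situation of Lemma 3.23. Applying this lemma, as well as the multiplicativity of the diamond norm, gives

\[ F_{\max}(\Phi_1 \otimes \Phi_2, \Psi_1 \otimes \Psi_2) = \|\Gamma_1 \otimes \Gamma_2\|_\diamond = \|\Gamma_1\|_\diamond \|\Gamma_2\|_\diamond = F_{\max}(\Phi_1, \Psi_1)\, F_{\max}(\Phi_2, \Psi_2), \]

as claimed. □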



3.7 Polarization of the diamond norm 

This section describes a method for "polarizing" the diamond norm of two channels. This is a technique that, starting with two channels $\Phi_1, \Phi_2 \in T(\mathcal{H},\mathcal{K})$, and constants $0 < b < a < 2$ such that $2b < a^2$, creates channels $\Psi_1$ and $\Psi_2$ satisfying

\begin{align*}
\|\Phi_1 - \Phi_2\|_\diamond \le b &\implies \|\Psi_1 - \Psi_2\|_\diamond \le 2^{-k}, \\
\|\Phi_1 - \Phi_2\|_\diamond \ge a &\implies \|\Psi_1 - \Psi_2\|_\diamond \ge 2 - 2^{-k}.
\end{align*}

The constructed channels $\Psi_1$ and $\Psi_2$ belong to $T(\mathcal{H}^{\otimes r}, \mathcal{K}^{\otimes r})$, where $r \in O(k)$, i.e. the size of the resulting channels depends only linearly on the error parameter $k$. This provides a powerful technique for reducing the error in many settings. It provides one way to see that any promise problem defined with a weak promise gap on the diamond norm of the difference of two channels can be reduced to the same problem with a much stronger gap, since an instance with the weaker gap can be polarized using this technique. The method is not perfect, however, as it depends on the technical condition that $2b < a^2$.

This construction generalizes the polarization technique of Sahai and Vadhan for the case of the $\ell_1$ norm of efficiently samplable probability distributions [SV03]. This construction was generalized by Watrous to the case of quantum states that can be efficiently prepared [Wat02]. The further generalization given here to quantum channels does not require any conceptual changes: the details work out in almost exactly the same way as in the case of states. This result is the product of joint work with John Watrous, and has been published in [RW05].



In order for the polarization technique to be useful in the setting of computational hardness it must satisfy one further significant property: the construction must be efficient. That is, given access to polynomial-time circuits (or black boxes) for the original channels $\Phi_1, \Phi_2$, circuits that implement the output channels $\Psi_1$ and $\Psi_2$ can be efficiently constructed. That the polarization technique has this property will be easy to observe from the construction given in the proof.

The proof of the polarization theorem makes use of two constructions. One of these constructions increases the diamond norm and the other reduces it. Applying these constructions in the correct sequence will result in transformations with the desired properties. These two constructions mirror the proof of the classical result due to Sahai and Vadhan [SV03].

The first construction is a technique for increasing the diamond norm of two channels. The idea is simple: it is much easier to distinguish $k$ copies of the channels than it is to distinguish one copy. The channels constructed using this procedure are simply $\Phi_i^{\otimes k}$. The argument must be carefully made, however, to show that entanglement across the multiple uses of the channels does not increase the diamond norm too much. The following direct product lemma gives bounds on the diamond norm for the difference of $k$ copies of the two channels.

Lemma 3.25. Let $\Phi_1, \Phi_2 \in T(\mathcal{H},\mathcal{K})$ have $\|\Phi_1 - \Phi_2\|_\diamond = \delta > 0$. Then for any positive integer $k$

\[ 2 - 2 e^{-k \delta^2 / 8} < \big\| \Phi_1^{\otimes k} - \Phi_2^{\otimes k} \big\|_\diamond \le k \delta. \]

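Proof. To prove the first inequality, let $\xi \in D(\mathcal{H} \otimes \mathcal{H})$ achieve the maximum in the diamond norm, i.e. let

\[ \|(\Phi_1 \otimes I_{\mathcal{H}})(\xi) - (\Phi_2 \otimes I_{\mathcal{H}})(\xi)\|_{\mathrm{tr}} = \|\Phi_1 - \Phi_2\|_\diamond = \delta. \]

Such a state exists by Theorem 3.17. As the trace norm is multiplicative with respect to the tensor product (by Equation (3.10)), $\|\xi^{\otimes k}\|_{\mathrm{tr}} = \|\xi\|_{\mathrm{tr}}^k = 1$. Evaluating the maximum in the definition of the diamond norm on this state, we find that

\[ \big\| \Phi_1^{\otimes k} - \Phi_2^{\otimes k} \big\|_\diamond \ge \big\| \big[ (\Phi_1 \otimes I_{\mathcal{H}})(\xi) \big]^{\otimes k} - \big[ (\Phi_2 \otimes I_{\mathcal{H}})(\xi) \big]^{\otimes k} \big\|_{\mathrm{tr}}. \tag{3.29} \]

We can then apply the bound from [Wat02] to the two states $\rho = (\Phi_1 \otimes I_{\mathcal{H}})(\xi)$ and $\sigma = (\Phi_2 \otimes I_{\mathcal{H}})(\xi)$, which satisfy $\|\rho - \sigma\|_{\mathrm{tr}} = \delta$, to obtain the desired inequality. For completeness, the proof of this bound follows.

Since the fidelity is multiplicative with respect to the tensor product of two states (Equation (3.27)) we can use the Fuchs-van de Graaf inequalities (Theorem 3.22) to obtain

\begin{align*}
\big\| \rho^{\otimes k} - \sigma^{\otimes k} \big\|_{\mathrm{tr}}
  &\ge 2 \big( 1 - F(\rho^{\otimes k}, \sigma^{\otimes k}) \big) = 2 \big( 1 - F(\rho, \sigma)^k \big) \\
  &\ge 2 - 2 \Big( \sqrt{1 - \|\rho - \sigma\|_{\mathrm{tr}}^2 / 4} \Big)^k = 2 - 2 \big( 1 - (\delta/2)^2 \big)^{k/2}.
\end{align*}

We can then bound this quantity using the inequality $(1 - x)^k < e^{-kx}$, which holds for all nonzero $x$ with $-1 < x < 1$. This can be verified by taking logarithms and considering a Taylor series for $\ln(1 - x)$. In our case $x = (\delta/2)^2$ and the exponent is $k/2$ (when $\delta = 2$, so that $x = 1$, the bound holds trivially since the left-hand side vanishes), so we have

\[ 2 - 2 \big( 1 - (\delta/2)^2 \big)^{k/2} > 2 - 2 \exp\Big( -\frac{\delta^2 k}{8} \Big) = 2 - 2 e^{-k \delta^2 / 8}. \]

Combining this with Equation (3.29) proves the first inequality.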

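The second inequality follows by induction on $k$. The case of $k = 1$ leaves nothing to prove. For $k > 1$, let $\Psi_i = \Phi_i^{\otimes (k-1)}$ for simplicity. Using this notation, as well as the triangle inequality, we have

\begin{align*}
\big\| \Phi_1^{\otimes k} - \Phi_2^{\otimes k} \big\|_\diamond
  &= \|\Psi_1 \otimes \Phi_1 - \Psi_2 \otimes \Phi_2\|_\diamond \\
  &= \|\Psi_1 \otimes \Phi_1 - \Psi_2 \otimes \Phi_1 + \Psi_2 \otimes \Phi_1 - \Psi_2 \otimes \Phi_2\|_\diamond \\
  &\le \|(\Psi_1 - \Psi_2) \otimes \Phi_1\|_\diamond + \|\Psi_2 \otimes (\Phi_1 - \Phi_2)\|_\diamond \\
  &= \|\Psi_1 - \Psi_2\|_\diamond \|\Phi_1\|_\diamond + \|\Psi_2\|_\diamond \|\Phi_1 - \Phi_2\|_\diamond.
\end{align*}

The final equality follows from the multiplicativity of the diamond norm, given by Theorem 3.15. Since the diamond norm of any channel is one (Equation (3.25)), the inductive hypothesis implies that

\[ \|\Psi_1 - \Psi_2\|_\diamond \|\Phi_1\|_\diamond + \|\Psi_2\|_\diamond \|\Phi_1 - \Phi_2\|_\diamond \le (k - 1)\delta + \delta = k\delta, \]

as required. □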
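The state-level bound used in this proof can be illustrated on diagonal states, where the trace norm of a difference is the $\ell_1$ distance between probability distributions. The sketch below takes two biased coins, computes the exact $\ell_1$ distance between $k$ independent copies of each, and compares it with the two bounds of Lemma 3.25. The function names and the choice of Bernoulli distributions are illustrative assumptions; the example checks only this classical special case.

```python
import math


def product_l1_distance(p, q, k):
    """l1 distance between k i.i.d. copies of Bernoulli(p) and of Bernoulli(q).

    Outcomes with the same number of ones are equally likely under each
    product distribution, so the 2**k terms are grouped by that count.
    """
    return sum(
        math.comb(k, j) * abs(p**j * (1 - p) ** (k - j) - q**j * (1 - q) ** (k - j))
        for j in range(k + 1)
    )


def direct_product_bounds(delta, k):
    """Lower and upper bounds of Lemma 3.25 for single-copy distance delta."""
    lower = 2.0 - 2.0 * math.exp(-k * delta**2 / 8.0)
    upper = k * delta
    return lower, upper


if __name__ == "__main__":
    p, q = 0.5, 0.6
    delta = product_l1_distance(p, q, 1)
    for k in (1, 5, 25, 100):
        lower, upper = direct_product_bounds(delta, k)
        print(k, lower, product_l1_distance(p, q, k), upper)
```

For small $k$ the upper bound $k\delta$ is the informative one, while for large $k$ it exceeds 2 and the exponential lower bound takes over.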
[Figure 3.1: Circuits $C_1$ and $C_2$ output by the construction in Lemma 3.26. Each circuit $C_i$ contains $r$ independent copies of the circuit $Q_i$. Panels: (a) the circuit $C_1$; (b) the circuit $C_2$.]

The lemma implies the existence of an efficient procedure to increase the diamond norm of two channels, which will form half of the construction used to polarize the diamond norm. The key property is that, while this procedure increases the norm, it does so much faster when the norm of the original circuits is large. The circuits produced by this procedure are illustrated in Figure 3.1.



Lemma 3.26. There is a polynomial-time deterministic procedure that, on input $(Q_1, Q_2, 1^r)$, where $Q_1, Q_2$ are descriptions of mixed-state quantum circuits, produces as output descriptions of two quantum circuits $(C_1, C_2)$ satisfying

\[ 2 - 2 \exp\Big( -\frac{r}{8} \|Q_1 - Q_2\|_\diamond^2 \Big) < \|C_1 - C_2\|_\diamond \le r \|Q_1 - Q_2\|_\diamond. \]

Proof. For $i = 1, 2$, the circuit $C_i$ is constructed from $r$ parallel copies of the circuit $Q_i$. This results in $C_i = Q_i^{\otimes r}$, so that the bounds in the statement of the lemma follow from Lemma 3.25. □

This procedure to increase the diamond norm of the difference of two channels is used in Chapter 6, as it preserves the degradability or antidegradability of the input channels. This will not be true of the remainder of the polarization procedure.

The second procedure that is used in the polarization construction is used to reduce the diamond norm of the difference of two channels. Before outlining the procedure, however, we prove the following simple property of the norm.
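Proposition 3.27. Let $\Phi_1, \Phi_2 \in T(\mathcal{H},\mathcal{K})$ and $\Psi_1, \Psi_2 \in T(\mathcal{F},\mathcal{G})$. Let

\begin{align*}
\Xi_1 &= \tfrac{1}{2} \Phi_1 \otimes \Psi_1 + \tfrac{1}{2} \Phi_2 \otimes \Psi_2, \\
\Xi_2 &= \tfrac{1}{2} \Phi_1 \otimes \Psi_2 + \tfrac{1}{2} \Phi_2 \otimes \Psi_1.
\end{align*}

Then $\|\Xi_1 - \Xi_2\|_\diamond = \tfrac{1}{2} \|\Phi_1 - \Phi_2\|_\diamond \|\Psi_1 - \Psi_2\|_\diamond$.

Proof. The diamond norm is multiplicative with respect to tensor products (Theorem 3.15), so that

\[ \|\Xi_1 - \Xi_2\|_\diamond = \big\| \tfrac{1}{2} (\Phi_1 - \Phi_2) \otimes (\Psi_1 - \Psi_2) \big\|_\diamond = \tfrac{1}{2} \|\Phi_1 - \Phi_2\|_\diamond \|\Psi_1 - \Psi_2\|_\diamond, \]

as required. □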



This property is useful in the proof that the technique for reducing the diamond norm works correctly. The idea behind this procedure is as follows. Even if $Q_1$ and $Q_2$ are easy to distinguish, the channel $C_1$ constructed by taking the tensor product of $r$ channels, each chosen from $\{Q_1, Q_2\}$ uniformly at random subject to the restriction that $Q_1$ appears an odd number of times, should be very hard to distinguish from the channel $C_2$ constructed in the same way, except that $Q_1$ appears an even number of times in $C_2$. In effect, a procedure that distinguishes $C_1$ and $C_2$ must succeed for all $r$ embedded channels: this is because the goal is to determine the parity of the number of times that $Q_1$ appears, and the parity is affected by even a single mistake made by the distinguishing procedure. This construction mirrors that used on states in [Wat02], which itself mirrors that used on probability distributions in [SV03]. The circuits produced by this procedure are illustrated in Figure 3.2.



Lemma 3.28. There is a deterministic polynomial-time procedure that, on input $(Q_1, Q_2, 1^r)$, where $Q_1, Q_2$ are descriptions of mixed-state quantum circuits, produces as output descriptions of two quantum circuits $(C_1, C_2)$ satisfying

\[ \|C_1 - C_2\|_\diamond = 2 \left( \frac{\|Q_1 - Q_2\|_\diamond}{2} \right)^r. \]



Proof. We use the circuits $C_1$ and $C_2$ outlined above. The circuit $C_1$ performs the transformation defined as

\[ C_1 = \frac{1}{2^{r-1}} \sum_{\substack{x_1, \ldots, x_r \in \{1,2\} \\ x_1 + \cdots + x_r \equiv 1 \pmod 2}} Q_{x_1} \otimes \cdots \otimes Q_{x_r}, \]

while $C_2$ performs a similar transformation defined as

\[ C_2 = \frac{1}{2^{r-1}} \sum_{\substack{x_1, \ldots, x_r \in \{1,2\} \\ x_1 + \cdots + x_r \equiv 0 \pmod 2}} Q_{x_1} \otimes \cdots \otimes Q_{x_r}. \]

These circuits run $r$ copies of $Q_1$ and/or $Q_2$ in parallel, where the choice of $Q_1$ or $Q_2$ is determined uniformly at random subject to the constraint that $C_1$ applies an odd number of copies of $Q_1$ while $C_2$ applies an even number. Such circuits may be constructed in time polynomial in the sizes of $Q_1$ and $Q_2$ by using ancillary qubits with Hadamard and dephasing gates to generate the randomness.

[Figure 3.2: Circuits $C_1$ and $C_2$ output by the construction in Lemma 3.28. The circuit $C_i$ consists of $r - 1$ independent circuits $Q_{x_j}$, each chosen randomly at run-time from $\{Q_1, Q_2\}$, and one final circuit chosen so that the parity of the indices of the chosen circuits is odd in the case $i = 1$ and even in the case that $i = 2$. Panels: (a) the circuit $C_1$; (b) the circuit $C_2$.]



A proof by induction based on Proposition 3.27 establishes the desired equality. This proof is included here for completeness. The base case, $r = 1$, leaves nothing to prove. Let $r > 1$, and let $D_1, D_2$ be the channels $C_1$ and $C_2$ for the case $r - 1$. Notice that

\[ \|C_1 - C_2\|_\diamond = \tfrac{1}{2} \big\| Q_1 \otimes D_2 + Q_2 \otimes D_1 - Q_1 \otimes D_1 - Q_2 \otimes D_2 \big\|_\diamond, \]

which can be observed from the construction of $C_1$ and $C_2$ by considering the case that the first transformation is $Q_1$ or $Q_2$ and applying the parity conditions. Applying Proposition 3.27 to this, we have

\[ \|C_1 - C_2\|_\diamond = \tfrac{1}{2} \|Q_1 - Q_2\|_\diamond \|D_1 - D_2\|_\diamond = 2 \left( \frac{\|Q_1 - Q_2\|_\diamond}{2} \right)^r, \]

where the last equality is by the induction hypothesis on $\|D_1 - D_2\|_\diamond$. □
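The parity construction can be checked by brute force in the classical special case where $Q_1$ and $Q_2$ are replaced by probability distributions and the diamond norm by the $\ell_1$ distance. The following sketch builds both mixtures explicitly and compares their distance with $2(\delta/2)^r$. All names are illustrative, and the enumeration is exponential in $r$, so it is only meant for very small examples.

```python
from itertools import product


def tensor(distributions):
    """Product distribution of a list of probability vectors, keyed by outcome tuples."""
    result = {(): 1.0}
    for dist in distributions:
        result = {
            outcome + (y,): weight * prob
            for outcome, weight in result.items()
            for y, prob in enumerate(dist)
        }
    return result


def parity_mixture(q1, q2, r, parity):
    """Uniform mixture of Q_{x_1} x ... x Q_{x_r} over labels x_i in {1, 2}
    whose sum is congruent to `parity` modulo 2 (parity 1 means Q_1 appears
    an odd number of times)."""
    by_label = {1: q1, 2: q2}
    mixture = {}
    for labels in product((1, 2), repeat=r):
        if sum(labels) % 2 != parity:
            continue
        for outcome, prob in tensor([by_label[x] for x in labels]).items():
            mixture[outcome] = mixture.get(outcome, 0.0) + prob / 2 ** (r - 1)
    return mixture


def l1_distance(c1, c2):
    """l1 distance between two distributions stored as dictionaries."""
    return sum(abs(c1.get(k, 0.0) - c2.get(k, 0.0)) for k in set(c1) | set(c2))


if __name__ == "__main__":
    q1, q2 = [0.9, 0.1], [0.4, 0.6]
    delta = l1_distance(tensor([q1]), tensor([q2]))
    for r in range(1, 6):
        c1 = parity_mixture(q1, q2, r, 1)
        c2 = parity_mixture(q1, q2, r, 0)
        print(r, l1_distance(c1, c2), 2 * (delta / 2) ** r)
```

The two printed columns agree up to rounding, reflecting the fact that the difference of the two mixtures is, up to sign and the factor $2^{-(r-1)}$, the $r$-fold tensor power of the difference of the two distributions.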



These two constructions, taken together, suffice to prove the polarization theorem. The proof consists of an application of Lemma 3.28 followed by an application of Lemma 3.26 followed by one more application of Lemma 3.28. The proof is intuitively simple, though it is quite technical: the value of $r$ used in each transformation must be chosen very carefully.

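Theorem 3.29. Let the constants $a, b \in (0, 2)$ satisfy $2b < a^2$. There exists a deterministic polynomial-time procedure that, given input $(Q_1, Q_2, 1^n)$, where $Q_1$ and $Q_2$ are mixed-state quantum circuits, outputs quantum circuits $(C_1, C_2)$ such that

\begin{align*}
\|Q_1 - Q_2\|_\diamond \le b &\implies \|C_1 - C_2\|_\diamond < 2^{-n}, \\
\|Q_1 - Q_2\|_\diamond \ge a &\implies \|C_1 - C_2\|_\diamond > 2 - 2^{-n}.
\end{align*}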


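Proof. First, we apply Lemma 3.28 to the input $(Q_1, Q_2, 1^r)$, with

\[ r = \left\lceil \log(16 n) / \log(a^2 / (2b)) \right\rceil. \]

This results in circuits $(Q_1', Q_2')$ such that

\begin{align*}
\|Q_1 - Q_2\|_\diamond \le b &\implies \|Q_1' - Q_2'\|_\diamond \le 2 (b/2)^r, \\
\|Q_1 - Q_2\|_\diamond \ge a &\implies \|Q_1' - Q_2'\|_\diamond \ge 2 (a/2)^r.
\end{align*}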



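Next, we apply Lemma 3.26 to the input $(Q_1', Q_2', 1^s)$, where

\[ s = \left\lfloor (b/2)^{-r} / 4 \right\rfloor. \]

This procedure produces circuits $(Q_1'', Q_2'')$ satisfying

\begin{align*}
\|Q_1 - Q_2\|_\diamond \le b &\implies \|Q_1'' - Q_2''\|_\diamond \le 2 (b/2)^r (b/2)^{-r} / 4 = 1/2, \\
\|Q_1 - Q_2\|_\diamond \ge a &\implies \|Q_1'' - Q_2''\|_\diamond > 2 - 2 \exp\Big( -\frac{s}{2} (a/2)^{2r} \Big) \ge 2 - 2 e^{-2n+1}.
\end{align*}

The last inequality is due to the fact that

\[ \frac{s}{2} (a/2)^{2r} + 1 \ge \frac{1}{8} (b/2)^{-r} (a/2)^{2r} = \frac{1}{8} \left( \frac{a^2}{2b} \right)^r, \]

where the $+1$ term on the left is due to the floor in the definition of $s$. Taking logarithms (base 2) of both sides, this is

\[ \log\Big( \frac{s}{2} (a/2)^{2r} + 1 \Big) \ge r \log\Big( \frac{a^2}{2b} \Big) - 3 \ge \frac{\log 16n}{\log(a^2/(2b))} \log\Big( \frac{a^2}{2b} \Big) - 3 = \log 2n. \]

This implies that $2 - 2 \exp\big( -\frac{s}{2} (a/2)^{2r} \big) \ge 2 - 2 e^{-2n+1}$, as required.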

Finally, applying Lemma 3.28 once more, this time to $(Q_1'', Q_2'', 1^t)$, where

\[ t = \lceil (n+1)/2 \rceil, \]

we obtain circuits $(C_1, C_2)$ such that

\begin{align*}
\|Q_1 - Q_2\|_\diamond \le b &\implies \|C_1 - C_2\|_\diamond \le (1/2)^{(n+1)/2} (1/2)^{(n-1)/2} = 2^{-n}, \\
\|Q_1 - Q_2\|_\diamond \ge a &\implies \|C_1 - C_2\|_\diamond > \big( 2 - 2 e^{-2n+1} \big)^{\lceil (n+1)/2 \rceil} (1/2)^{\lceil (n+1)/2 \rceil - 1} \ge 2 - 2^{-n}.
\end{align*}

The final inequality is due to the fact that by Bernoulli's inequality

\[ 2 \big( 1 - e^{-2n+1} \big)^{\lceil (n+1)/2 \rceil} \ge 2 \big( 1 - \lceil (n+1)/2 \rceil\, e^{-2n+1} \big) > 2 - 2^{-n}. \]

The circuits $(C_1, C_2)$ have size $rst$ times the size of the original circuits $(Q_1, Q_2)$. By inspecting these quantities we find that $r, t \in O(n)$ and $s \in O(n^c)$ for $c$ a constant depending on the constants $a, b$. This implies that the construction can be implemented in time polynomial in $n$ and the size of the original circuits. □
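To make the choice of parameters concrete, the sketch below computes $r$, $s$ and $t$ as defined in the proof and then pushes the two extreme cases, $\|Q_1 - Q_2\|_\diamond = b$ and $\|Q_1 - Q_2\|_\diamond = a$, through the three stages using the bounds of Lemmas 3.28 and 3.26. The function names and the sample constants are illustrative assumptions. Note that $r$ is a ratio of logarithms, so the base of the logarithm does not affect it.

```python
import math


def polarization_parameters(a, b, n):
    """Parameters r, s, t used in the three stages of the polarization proof."""
    if not (0 < b and 0 < a < 2 and 2 * b < a * a):
        raise ValueError("need a, b in (0, 2) with 2b < a^2")
    r = math.ceil(math.log(16 * n) / math.log(a * a / (2 * b)))
    s = math.floor((b / 2) ** (-r) / 4)
    t = math.ceil((n + 1) / 2)
    return r, s, t


def polarized_bounds(a, b, n):
    """Upper bound in the 'close' case and lower bound in the 'far' case
    after applying Lemma 3.28, then Lemma 3.26, then Lemma 3.28 again."""
    r, s, t = polarization_parameters(a, b, n)

    # Close case: the norm starts at most b.
    upper = 2 * (b / 2) ** r      # Lemma 3.28 with parameter r
    upper = s * upper             # Lemma 3.26 with parameter s (upper bound)
    upper = 2 * (upper / 2) ** t  # Lemma 3.28 with parameter t

    # Far case: the norm starts at least a.
    lower = 2 * (a / 2) ** r                        # Lemma 3.28 with parameter r
    lower = 2 - 2 * math.exp(-s * lower ** 2 / 8)   # Lemma 3.26 (lower bound)
    lower = 2 * (lower / 2) ** t                    # Lemma 3.28 with parameter t
    return upper, lower


if __name__ == "__main__":
    for a, b, n in ((1.0, 0.25, 5), (1.0, 0.25, 10), (1.5, 0.5, 3)):
        print((a, b, n), polarization_parameters(a, b, n), polarized_bounds(a, b, n))
```

Even for these small values of $n$ the parameter $s$ is large, which reflects the polynomial (rather than linear) growth of $s$ noted at the end of the proof.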

3.8 Conclusion 

In this chapter several different measures on quantum states and channels have been introduced. Many of the important properties of these measures have also been established. This forms the basis for the remainder of the thesis: the quantities described here, and their properties, will find use throughout the problems studied later. For this reason it is hoped that this chapter will stand as a useful reference for these concepts.

There are two new results contained within this chapter. The first of these is the theorem in Section 3.5.1 that demonstrates that the maximum in the diamond norm on the difference of two channels is achieved on a pure quantum state, as opposed to a general linear operator. This property will be extremely useful when the diamond norm is later used as a way to quantify the distinguishability of two quantum channels, which is the central problem considered in this thesis. The second new result is the polarization technique for the diamond norm in Theorem 3.29. Part of this result is used in Chapter 6 to reduce the error in the reductions of the distinguishability problem to the degradable and the antidegradable channels. Both of these results are joint work with John Watrous [RW05].
Chapter 4

The Close Images Problem 



Given two quantum channels it is natural to ask how close the outputs of the two 
channels can be. When these channels are given as mixed-state quantum circuits this 
becomes the computational problem Close Images that is considered in this chapter. 
This problem is QIP-complete, as it is a restatement of the definition of the complexity
class. This result is due to Kitaev and Watrous [KW00], but it is included here because
it is the result that all of the other hardness results in the thesis depend on.

The main result of this chapter is that restricting this problem to input circuits of 
logarithmic depth does not reduce the computational difficulty. This is shown by 
constructing a reduction from an instance of Close Images to an instance on log-depth 
circuits. This provides further evidence for the computational power of log-depth 
circuits. The reduction that proves this result involves a simulation of the two input 
circuits by log-depth circuits. The maximum output fidelity of the constructed circuits 
is related to the maximum output fidelity for the original two circuits, so that this 
reduction preserves the structure of the close images problem. 



The results in this chapter on log-depth circuits have been published in [Ros08b].
4.1 Log-depth mixed-state quantum circuits



A significant practical problem in quantum information is that quantum systems 
quickly decohere when allowed to interact with the environment. This process severely 
limits the length of quantum computations that can be experimentally realized. Short 
quantum circuits provide a model of computation that can capture the kinds of computation
that we can perform under this type of time limit. For this reason it is of
significant interest to find short quantum circuits for important problems.

Log-depth quantum circuits have been found for several significant problems including
the approximate quantum Fourier transform [CW00] and the encoding and
decoding operations for many quantum error correcting codes [MN02]. In addition to
these applications, a procedure for parallelizing to log-depth a large class of quantum
circuits is known [BK09]. These examples demonstrate the surprising power of short
quantum circuits. It has been conjectured by Jozsa that any quantum algorithm can
be performed with logarithmic quantum circuit depth interspersed with polynomial
time classical computation [Joz06].

The standard circuit model of quantum computation is the unitary circuit model
applied to pure quantum states. In this thesis we consider the more general model of
mixed-state quantum computation introduced in Section 2.1. While much of the previous
work on short quantum circuits has been in the unitary circuit model [FGHZ05,
GHMP02], there has also been work outside of this model [TD04]. The primary advantage
of considering this more general model is that the mixed state model is able
to capture any physically realizable quantum operation, and so results on this model
may have implications for experimental quantum information.

In this chapter it is shown that the apparent power of short quantum computations 
comes with a price: the close images problem on logarithmic depth quantum circuits 
is exactly as difficult as the general problem on polynomial depth circuits. This result 
will be used in Chapter 5 to show that the problem of distinguishing mixed state
circuits is also no easier when restricted to log-depth circuits. 

The remainder of this chapter is organized as follows. In the next section, the close
images problem is discussed, and the result due to Kitaev and Watrous [KW00] that the
close images problem is complete for QIP is detailed. In Section 4.3 a key component
of the reduction to log-depth circuits is considered: the swap test. This procedure can
be used to ensure that two pure quantum states are close together, and as such is a
key component of many quantum algorithms. In Section 4.4 a Karp reduction from
the polynomial depth to logarithmic depth versions of the close images problem is
presented in detail. The correctness of this reduction is shown formally in Section 4.5.



4.2 QIP completeness of close images 

In this section an overview is given of the close images problem as it relates to the 
complexity class QIP. Close Images is essentially a restatement of the acceptance 
condition for the verifier in a quantum interactive proof system, and so it will be 
important to review this connection, as this connects all of the other computational 
problems studied in the thesis to the class QIP. 

In order to model the hardness of the class of problems having quantum interactive
proof systems, Kitaev and Watrous introduced the close images problem [KW00]. This
problem can be given the following formal definition.

Problem 4.1 (Close Images). For constants $0 < b < a \leq 1$, the input consists of
quantum circuits $Q_1$ and $Q_2$ that implement transformations in $T(\mathcal{H}, \mathcal{K})$. The promise
problem is to distinguish the two cases:

Yes: $F(Q_1(\rho), Q_2(\xi)) \geq a$ for some $\rho, \xi \in D(\mathcal{H})$,
No: $F(Q_1(\rho), Q_2(\xi)) \leq b$ for all $\rho, \xi \in D(\mathcal{H})$.

This is simply the problem of determining if there are inputs to $Q_1$ and $Q_2$ that
cause them to output states that are nearly the same. It will be helpful to abbreviate
this problem as $\mathrm{CI}_{a,b}$ when the constants $a$ and $b$ are significant. It is the aim of the
present chapter to prove that this problem remains complete for QIP when restricted
to circuits $Q_1$ and $Q_2$ that are of depth logarithmic in the number of input qubits. This
will be achieved in the case of perfect completeness, i.e. $a = 1$ in the above problem
definition. As discussed below, this problem remains complete for QIP in this case.
This restriction serves only to simplify the problem, as distinguishing the two cases for
a weaker promise can only be more difficult, so a hardness result on this case will also
imply the hardness of the more general problem. For the sake of brevity, the log-depth
version of this problem will be referred to as Log-depth $\mathrm{CI}_{a,b}$. Since this problem
is a restriction of a problem in QIP, as argued below, it is clear that it is also in QIP.
Similarly, the abbreviation Const-depth $\mathrm{CI}_{a,b}$ will be used to denote the version of this
problem on constant-depth circuits.

Although this problem was introduced by Kitaev and Watrous to show that $\mathrm{QIP} \subseteq \mathrm{EXP}$,
it was not explicitly defined in [KW00]. For this reason the reduction from the
value of a quantum interactive proof system to the close images problem is repeated
here. The hardness of this problem is significant for the thesis: it is from this reduction
that all of the other QIP-hardness results in the thesis follow.

Figure 4.1: The operations and Hilbert spaces corresponding to a three message quantum
interactive proof system.

Recall that, due to the results of Kitaev and Watrous [KW00], any quantum interactive
proof system can be parallelized to three messages. For any input $x$, this results
in unitary circuits $P_1$, $V_1$, $P_2$, and $V_2$ acting on systems $\mathcal{V}$, $\mathcal{M}$, $\mathcal{P}$ for the verifier's private
space, the message space, and the prover's private space respectively. These spaces
and transformations are illustrated in Figure 4.1. Recall from Section 2.2 that these
circuits depend on the input, but the verifier's circuits $V_1$ and $V_2$ must be generated in
polynomial time in the length of $x$. For this reason the input string does not appear in
the description of the protocol: it is "hard-coded" into the circuits of the two parties.
As the verifier accepts depending on the result of a measurement on one of the message
qubits at the end of the protocol, the value of the quantum interactive proof system
for fixed transformations $P_i$ is given by

$$\operatorname{tr}\left[\Pi \left(V_2 \circ P_2 \circ V_1 \circ P_1 (|0\rangle\langle 0|)\right)\right],$$

where $\Pi$ is the projector onto the verifier's accepting subspace. The prover can make
the verifier accept if there exist circuits $P_1$ and $P_2$ such that this probability is large.

To reduce this problem to an instance of CI we must find transformations $Q_1$ and $Q_2$
that have close images if and only if the verifier can be made to accept. The construction
of these two transformations is outlined in Figure 4.2. The transformation $Q_1$ is the
first half of the protocol, consisting of the unitary circuit $V_1$ applied to the prover's
first message. The transformation $Q_2$ represents the second half of the protocol run
in reverse: starting from an accepting state $|1\rangle\langle 1| \otimes \sigma$ and performing $V_2^*$, which is
the inverse of the unitary circuit $V_2$. These transformations are given in [KW00] more
formally as

$$Q_1(\rho) = \operatorname{tr}_{\mathcal{M}} V_1 (|0\rangle\langle 0| \otimes \rho) V_1^*,$$
$$Q_2(\sigma) = \operatorname{tr}_{\mathcal{M}} V_2^* (|1\rangle\langle 1| \otimes \sigma) V_2. \qquad (4.1)$$

Figure 4.2: Construction of the circuits $Q_1$ and $Q_2$ in the reduction from a three message
quantum interactive proof system to an instance of Close Images. The space $\mathcal{P}$ of the
prover's private system is not shown and $\rho$ represents the prover's first message.

These transformations do not take inputs of the same dimension, but this can easily
be fixed by padding the input space of $Q_1$ with qubits that will later be traced out.
The idea is that the verifier will accept in this protocol if and only if there are states
$\rho$ and $\sigma$ that are consistent with a transcript of the QIP protocol where the verifier
accepts. In this case, these states exist if and only if the verifier's private qubits do not
change between $V_1$ and $V_2$, which happens exactly when there are $\rho$ and $\sigma$ such that
$F(Q_1(\rho), Q_2(\sigma))$ is large. In particular, if the verifier accepts with certainty,
then the output of $Q_1$ on the input state of the proof system is exactly the output of $Q_2$ on
some accepting configuration of the proof system. This is because there is a strategy
for the prover that both causes the verifier to accept with probability one and does not
change the state of the verifier's private space (because this is not allowed in the QIP
model). Therefore, if the verifier accepts with certainty, there exist $\rho, \sigma$ such that

$$Q_1(\rho) = Q_2(\sigma),$$

so that we have a valid instance of $\mathrm{CI}_{1,b}$ in this case.



To see this more formally, the following lemma of Kitaev and Watrous [KW00]
characterizes the probability that the verifier can be made to accept in a three-message
quantum interactive proof system.

Lemma 4.2 (Kitaev and Watrous [KW00]). Let $V_1$ and $V_2$ describe a verifier in a quantum
interactive proof system, as shown in Figure 4.1, and let $Q_1$ and $Q_2$ be as given in Equation (4.1).
The maximum probability that the verifier can be made to accept is $F_{\max}(Q_1, Q_2)^2$.

Proof. By the definition of the model, the acceptance probability is given by a projector
$\Pi_{\mathrm{acc}}$ onto the subspace of the verifier's private qubits with the first qubit in the state
$|1\rangle$. By re-adding the prover's private space we may assume that all states during
the protocol are pure. The maximum acceptance probability is defined as a maximum
over the prover's strategy $P$ and the initial state $|\phi\rangle$ corresponding to the prover's first
message, which is in the $|0\rangle$ state for all but the message qubits and the prover's private
qubits. Doing so, this quantity is

$$\max \operatorname{tr}\left(\Pi_{\mathrm{acc}} V_2 P V_1 |\phi\rangle\langle\phi| V_1^* P^* V_2^*\right) = \max \left|\langle\psi| \Pi_{\mathrm{acc}} V_2 P V_1 |\phi\rangle\right|^2 = \max \left|\operatorname{tr} P \left(V_1 |\phi\rangle\langle\psi| V_2\right)\right|^2, \qquad (4.2)$$

where $|\psi\rangle$ is restricted to be an accepting state, i.e. the first of the verifier's private
qubits is in the $|1\rangle$ state, and $|\phi\rangle$ is restricted to be an initial state, i.e. the verifier's
private qubits are in the $|0\rangle$ state. If we first trace out the space $\mathcal{V}$ in this equation, then
by Lemma 3.7 the resulting quantity is equal to

$$\max \left|\operatorname{tr} P \operatorname{tr}_{\mathcal{V}} \left(V_1 |\phi\rangle\langle\psi| V_2\right)\right|^2 = \max \left\|\operatorname{tr}_{\mathcal{V}} V_1 |\phi\rangle\langle\psi| V_2\right\|_{\mathrm{tr}}^2 = \max \left\|\operatorname{tr}_{\mathcal{M} \otimes \mathcal{P}} V_1 |\phi\rangle\langle\psi| V_2\right\|_{\mathrm{tr}}^2,$$

where the final equality follows from the fact that the complementary reduced states
of a pure state have the same singular values. Combining this with Equation (4.2)
implies that the verifier can be made to accept with probability

$$\max \left\|\operatorname{tr}_{\mathcal{M} \otimes \mathcal{P}} V_1 |\phi\rangle\langle\psi| V_2\right\|_{\mathrm{tr}}^2 = \max F(Q_1(\rho), Q_2(\sigma))^2 = F_{\max}(Q_1, Q_2)^2,$$

where the first equality is by Lemma 3.21. Recall that the state $|\phi\rangle = |0\rangle \otimes |\phi'\rangle$ is a
valid initial state and that $|\psi\rangle = |1\rangle \otimes |\psi'\rangle$ is a valid accepting state for the verifier: these
conditions on the two pure states conform exactly to the states in the definition of $Q_1$
and $Q_2$ in Equation (4.1). □



This lemma implies directly that $\mathrm{CI}_{a,b}$ is QIP-hard for any probabilities $a, b$ that
suffice for the definition of QIP in Section 2.2. As it is known that these parameters
may be any values such that $0 < b < a \leq 1$ with at least an inverse polynomial gap
between $a$ and $b$, this implies that $\mathrm{CI}_{a,b}$ is hard for these same values.

To see that this problem is in QIP, consider the following protocol for the verifier,
due to Kitaev and Watrous (John Watrous, private communication). The verifier starts
with two circuits implementing transformations $Q_1$ and $Q_2$ with

$$Q_i(\rho) = \operatorname{tr}_{\mathcal{B}} U_i (\rho \otimes |0\rangle\langle 0|) U_i^*.$$

As a first step, the prover sends a state $\rho$, promised to be such that $Q_1(\rho)$ is close to a state
in the image of $Q_2$. The verifier computes

$$U_1 (\rho \otimes |0\rangle\langle 0|) U_1^*$$

and sends the part of the state in $\mathcal{B}$ to the prover. If there is a state $\sigma$ such that
$Q_2(\sigma) = Q_1(\rho)$, then the prover and verifier together hold a purification of this state.
In this case, the prover can apply a unitary to his portion of the system to obtain a
state corresponding to the purification that would have been obtained had the verifier
instead evaluated

$$U_2 (\sigma \otimes |0\rangle\langle 0|) U_2^*.$$

The prover performs such a computation, and sends the state in $\mathcal{B}$ back to the verifier,
who applies $U_2^*$ and checks to see that the result is a valid initial state (i.e. his private
qubits are in the $|0\rangle$ state).

The above argument implies that when $Q_1(\rho) = Q_2(\sigma)$ the prover can succeed with
certainty. In the general case, the maximum probability that the verifier can be made
to accept is given by Lemma 4.2, which in the case of this proof system is exactly
$F_{\max}(Q_1, Q_2)^2$. This argument shows that $\mathrm{CI}_{a,b}$ is in QIP for all $0 < b < a \leq 1$ with at
least an inverse polynomial gap between $a$ and $b$.

The preceding arguments imply that the problem is complete for QIP. This argument
appears implicitly in [KW00], where it is used to show that $\mathrm{QIP} \subseteq \mathrm{EXP}$.

Theorem 4.3 (Kitaev and Watrous [KW00]). For any $0 < b < a \leq 1$, the problem $\mathrm{CI}_{a,b}$ is
QIP-complete.



4.3 The swap test 

The swap test provides a simple way to detect if two pure quantum states are the same.
It was introduced by Buhrman et al. in the context of quantum communication complexity
[BCWdW01], but it has also found applications in error correction [BBD+97]
and in the estimation of various properties of quantum states [EAO+02]. Generalizations
of the swap test to more than two inputs have also been considered [KNY08],
though they will not be needed here.

Figure 4.3: A constant-depth implementation of the $W$ gate.

Figure 4.4: A circuit implementing the swap test.



An essential component of the swap test is the operator $W \in U(\mathcal{H} \otimes \mathcal{H})$ that swaps
the states in the two spaces, i.e. $W|a\rangle|b\rangle = |b\rangle|a\rangle$ for all $|a\rangle, |b\rangle \in \mathcal{H}$. Expressing $W$ in
the computational basis gives

$$W = \sum_{i,j} |j\rangle\langle i| \otimes |i\rangle\langle j|,$$

from which it is clear that $W$ is both Hermitian and unitary.

As circuit depth is one of the primary considerations of this chapter, notice that
as a circuit on $2n$ qubits, the operation $W$ can be implemented in constant depth.
Such an implementation can be given by $n$ independent two-qubit swaps, as shown
in Figure 4.3. These two-qubit swaps can each be implemented in the usual basis of
quantum gates using three controlled-not gates, as shown in Figure 2.3 of Section 2.1.

The swap test is built using a controlled-$W$ operator to determine how close two
states are to each other. A circuit performing the swap test is given in Figure 4.4.



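An alternate characterization of the swap test is simply as a projective measurement
onto the symmetric and antisymmetric subspaces of the systems it is applied to. Let
$\{|i\rangle : 1 \leq i \leq d\}$ be a basis for $\mathcal{H}$. On $\mathcal{H} \otimes \mathcal{H}$, the symmetric and antisymmetric
subspaces are defined as

$$(\mathcal{H} \otimes \mathcal{H})_{\mathrm{sym}} = \{|\psi\rangle \in \mathcal{H} \otimes \mathcal{H} : W|\psi\rangle = |\psi\rangle\} = \operatorname{span}\{|i\rangle|j\rangle + |j\rangle|i\rangle : i \leq j\},$$

$$(\mathcal{H} \otimes \mathcal{H})_{\mathrm{asym}} = \{|\psi\rangle \in \mathcal{H} \otimes \mathcal{H} : W|\psi\rangle = -|\psi\rangle\} = \operatorname{span}\{|i\rangle|j\rangle - |j\rangle|i\rangle : i < j\},$$

where these two subspaces represent the $\binom{d+1}{2}$ dimensional subspace corresponding
to the $+1$ eigenvalues of $W$ and the $\binom{d}{2}$ dimensional subspace corresponding to the $-1$
eigenvalues of $W$. From this representation it is easy to see that these two subspaces
make up the whole space, i.e. that

$$\mathcal{H} \otimes \mathcal{H} = (\mathcal{H} \otimes \mathcal{H})_{\mathrm{sym}} \oplus (\mathcal{H} \otimes \mathcal{H})_{\mathrm{asym}},$$

and that the projections onto these subspaces are given by $(\mathbb{1} + W)/2$ and $(\mathbb{1} - W)/2$.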
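
These properties of the swap operator are easy to confirm numerically for small dimension. The following is a minimal sketch in NumPy, not part of the original development: the helper names `swap_operator` and `projector_ranks` are hypothetical, and the dimension `d = 3` is an arbitrary choice. It checks the action of the swap operator on a product vector, that it is Hermitian and squares to the identity, and that the two projectors have ranks $d(d+1)/2$ and $d(d-1)/2$.

```python
import numpy as np

def swap_operator(d):
    """Return W = sum_{i,j} |j><i| (x) |i><j| as a d^2 x d^2 matrix."""
    W = np.zeros((d * d, d * d))
    for i in range(d):
        for j in range(d):
            # Basis vector |i>|j> has index i*d + j; W sends it to |j>|i>.
            W[j * d + i, i * d + j] = 1.0
    return W

def projector_ranks(d):
    """Ranks of (1 + W)/2 and (1 - W)/2, computed as traces of the projectors."""
    W = swap_operator(d)
    identity = np.eye(d * d)
    p_sym, p_asym = (identity + W) / 2, (identity - W) / 2
    return int(round(float(np.trace(p_sym)))), int(round(float(np.trace(p_asym))))

d = 3
W = swap_operator(d)
a, b = np.random.default_rng(0).normal(size=(2, d))
assert np.allclose(W @ np.kron(a, b), np.kron(b, a))        # W|a>|b> = |b>|a>
assert np.allclose(W, W.T)                                   # Hermitian (real symmetric)
assert np.allclose(W @ W, np.eye(d * d))                     # hence also unitary
assert projector_ranks(d) == (d * (d + 1) // 2, d * (d - 1) // 2)
```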
To see that this formulation of the swap test is equivalent to the circuit presented
in Figure 4.4, consider the result of the measurement on the control qubit and work
through the circuit in reverse. If this measurement result is $|0\rangle$, then the state of the
control qubit after the controlled-$W$ operation is $|0\rangle + |1\rangle$ up to normalization. As this
is also the state of this qubit before the controlled-$W$ operation, then applying $W$ did
not change the phase of the system, which implies that the input has been projected
onto the symmetric subspace. On the other hand, if the measurement result is $|1\rangle$,
then the state of the control qubit after the controlled-$W$ operation is $|0\rangle - |1\rangle$. In this
case, the system has been projected onto the subspace where applying $W$ results in a
phase of $-1$, which is exactly the antisymmetric subspace of $\mathcal{H} \otimes \mathcal{H}$. Thus the circuit
in Figure 4.4 applies the projective measurement given by $(\mathbb{1} + W)/2$ and $(\mathbb{1} - W)/2$,
exactly as required.

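This characterization immediately gives the probability that the swap test returns
the antisymmetric outcome when applied to two pure states. To see this, observe that
on pure states $|\psi\rangle$ and $|\phi\rangle$ this occurs with probability

$$\frac{1}{2} \operatorname{tr}\left((\mathbb{1} - W) |\psi\rangle\langle\psi| \otimes |\phi\rangle\langle\phi|\right) = \frac{1}{2} \left[\operatorname{tr} |\psi\rangle\langle\psi| \otimes |\phi\rangle\langle\phi| - \operatorname{tr} |\phi\rangle\langle\psi| \otimes |\psi\rangle\langle\phi|\right] = \frac{1}{2} \left(1 - |\langle\psi|\phi\rangle|^2\right). \qquad (4.3)$$

This result can be found in [BCWdW01].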
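
The agreement between the circuit of Figure 4.4 and Equation (4.3) can also be seen by direct simulation. The sketch below assumes the usual form of the swap test circuit (a Hadamard gate on the control qubit, a controlled-$W$, and a second Hadamard before measurement); the function names are hypothetical and the dimension 4 and random seed are arbitrary.

```python
import numpy as np

def swap_test_outcome_one_probability(psi, phi):
    """Simulate H, controlled-W, H on a control qubit and return Pr[control = 1]."""
    d = len(psi)
    identity = np.eye(d * d)
    # Swap operator: exchange the two tensor factors of the row index of the identity.
    W = identity.reshape(d, d, d, d).transpose(1, 0, 2, 3).reshape(d * d, d * d)
    H = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)
    zeros = np.zeros_like(identity)
    controlled_W = np.block([[identity, zeros], [zeros, W]])
    hadamard_on_control = np.kron(H, identity)
    state = np.kron(np.array([1.0, 0.0]), np.kron(psi, phi))
    state = hadamard_on_control @ controlled_W @ hadamard_on_control @ state
    # The control qubit is the leading tensor factor, so |1> is the second half.
    return float(np.linalg.norm(state[d * d:]) ** 2)

def random_state(d, rng):
    """A random unit vector in C^d."""
    v = rng.normal(size=d) + 1j * rng.normal(size=d)
    return v / np.linalg.norm(v)

rng = np.random.default_rng(7)
psi, phi = random_state(4, rng), random_state(4, rng)
simulated = swap_test_outcome_one_probability(psi, phi)
predicted = 0.5 * (1 - abs(np.vdot(psi, phi)) ** 2)   # right-hand side of (4.3)
assert np.isclose(simulated, predicted)
```

Identical inputs never produce the antisymmetric outcome, while orthogonal inputs produce it with probability exactly one half, which is why a single swap test gives only one-sided evidence that two states differ.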



The results that follow will make use of a generalization of this equation to the
case of two (potentially entangled) mixed states. In the process of making this generalization
the square in Equation (4.3) is lost, and so we will only show a lower bound.
Notice also that the requirement in the lemma that the two input states be reduced
states of a single joint state can be made without loss of generality, by applying the
lemma to $\rho = \sigma \otimes \xi$ in the case that the input states are not entangled.

Lemma 4.4. If $\rho \in D(\mathcal{A} \otimes \mathcal{B})$ then a swap test on $\mathcal{A} \otimes \mathcal{B}$ returns the antisymmetric outcome
with probability at least

$$\frac{1}{2} - \frac{1}{2} F(\operatorname{tr}_{\mathcal{A}} \rho, \operatorname{tr}_{\mathcal{B}} \rho).$$

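Proof. Let $|\psi\rangle \in \mathcal{A} \otimes \mathcal{B} \otimes \mathcal{C}$ be a purification of $\rho$, where $\mathcal{C}$ is an arbitrary space with
$\dim \mathcal{C} \geq \dim \mathcal{A} \dim \mathcal{B}$ to allow such a purification. The swap test measures the state
on $\mathcal{A} \otimes \mathcal{B}$ with the projectors $\frac{1}{2}(\mathbb{1} - W)$ and $\frac{1}{2}(\mathbb{1} + W)$. As $W$ is Hermitian and $W^2 = \mathbb{1}$,
the antisymmetric outcome occurs with probability given by

$$\frac{1}{2} \operatorname{tr}\left[(\mathbb{1}_{\mathcal{A} \otimes \mathcal{B} \otimes \mathcal{C}} - W \otimes \mathbb{1}_{\mathcal{C}}) |\psi\rangle\langle\psi|\right] = \frac{1}{2} - \frac{1}{2} \langle\psi| W \otimes \mathbb{1}_{\mathcal{C}} |\psi\rangle. \qquad (4.4)$$

The operator $W$ is also unitary and so the states $|\psi\rangle$ and $(W \otimes \mathbb{1}_{\mathcal{C}})|\psi\rangle$ purify
$\operatorname{tr}_{\mathcal{A} \otimes \mathcal{C}} |\psi\rangle\langle\psi|$ and $\operatorname{tr}_{\mathcal{B} \otimes \mathcal{C}} |\psi\rangle\langle\psi|$ respectively, and so by Uhlmann's Theorem (Theorem 3.18)
Equation (4.4) implies

$$\frac{1}{2} - \frac{1}{2} \langle\psi| W \otimes \mathbb{1}_{\mathcal{C}} |\psi\rangle \geq \frac{1}{2} - \frac{1}{2} F\left(\operatorname{tr}_{\mathcal{A} \otimes \mathcal{C}} |\psi\rangle\langle\psi|, \operatorname{tr}_{\mathcal{B} \otimes \mathcal{C}} |\psi\rangle\langle\psi|\right). \qquad (4.5)$$

Finally, by observing that

$$\operatorname{tr}_{\mathcal{A} \otimes \mathcal{C}} |\psi\rangle\langle\psi| = \operatorname{tr}_{\mathcal{A}} \left(\operatorname{tr}_{\mathcal{C}} |\psi\rangle\langle\psi|\right) = \operatorname{tr}_{\mathcal{A}} \rho$$

and

$$\operatorname{tr}_{\mathcal{B} \otimes \mathcal{C}} |\psi\rangle\langle\psi| = \operatorname{tr}_{\mathcal{B}} \left(\operatorname{tr}_{\mathcal{C}} |\psi\rangle\langle\psi|\right) = \operatorname{tr}_{\mathcal{B}} \rho,$$

Equation (4.5) is the lower bound in the statement of the lemma. □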
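
The bound of Lemma 4.4 can be tested numerically on random mixed states. The sketch below assumes that the fidelity is the unsquared quantity $F(\rho, \sigma) = \|\sqrt{\rho}\sqrt{\sigma}\|_{\mathrm{tr}}$, which is consistent with the square appearing in Lemma 4.2; all function names are hypothetical, and the random density matrices are generated in an ad hoc way purely for illustration.

```python
import numpy as np

def psd_sqrt(M):
    """Square root of a positive semidefinite matrix via its eigendecomposition."""
    vals, vecs = np.linalg.eigh(M)
    vals = np.clip(vals, 0.0, None)          # remove tiny negative rounding errors
    return (vecs * np.sqrt(vals)) @ vecs.conj().T

def fidelity(rho, sigma):
    """F(rho, sigma) = || sqrt(rho) sqrt(sigma) ||_tr (not squared)."""
    singular_values = np.linalg.svd(psd_sqrt(rho) @ psd_sqrt(sigma), compute_uv=False)
    return float(singular_values.sum())

def random_density_matrix(dim, rng):
    G = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = G @ G.conj().T
    return rho / np.trace(rho).real

def antisymmetric_probability_and_bound(d, rng):
    """Return (Pr[antisymmetric outcome], 1/2 - F/2) for a random rho on A (x) B."""
    rho = random_density_matrix(d * d, rng)
    identity = np.eye(d * d)
    W = identity.reshape(d, d, d, d).transpose(1, 0, 2, 3).reshape(d * d, d * d)
    p_asym = 0.5 * np.trace((identity - W) @ rho).real
    R = rho.reshape(d, d, d, d)              # indices (a, b, a', b')
    tr_A_rho = np.einsum('abac->bc', R)      # partial trace over A
    tr_B_rho = np.einsum('abcb->ac', R)      # partial trace over B
    return p_asym, 0.5 - 0.5 * fidelity(tr_A_rho, tr_B_rho)

rng = np.random.default_rng(1)
for d in (2, 3, 4):
    p_asym, bound = antisymmetric_probability_and_bound(d, rng)
    assert p_asym >= bound - 1e-9            # the inequality of Lemma 4.4
```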



To see that this lemma generalizes Equation (4.3) up to a square, consider the input
state $\rho = |\phi\rangle\langle\phi| \otimes |\psi\rangle\langle\psi|$ and apply the definition of the fidelity.

The main result of this chapter concerns log-depth circuits, and so it is important to
note that the swap test can be performed in log-depth. As discussed in Proposition 2.1,
controlled operations on $n$ qubits can be implemented by adding only log-depth
overhead. This implies that the swap test can be implemented with a log-depth
circuit since, apart from the overhead for the controlled operation, Figure 4.4
provides a constant depth circuit. Additionally, this implies that if the unbounded
fan-out gate is allowed into the model of computation, this overhead can be reduced
to constant-depth, and so in this case the swap test can be performed with a constant
depth circuit. As it is not at all clear that including this gate produces a reasonable
circuit model, any results that depend on the addition of this gate to the circuit model
are clearly marked with this requirement.
4.4 Reduction to logarithmic depth

In this section the reduction from the general close images problem to the log-depth
restriction of the problem is described. This is done in the case of one-sided error, i.e.
the problem $\mathrm{CI}_{1,b}$, where the two circuits either have intersecting images or the largest
fidelity between any two outputs is at most $b$. The fact that this restricted version of
the problem is hard implies that we obtain the desired hardness result even when we
assume that the input instance has $a = 1$.

The general idea behind the construction is to simply slice the circuits of an instance
of $\mathrm{CI}_{1,b}$ into constant-depth pieces and run them in parallel. These circuits will have
much larger input spaces than the original circuit, but they are able to simulate the 
original circuit. This is due to the fact that if for each of the constant depth pieces, the 
input to one piece of the circuit is identical to the output of the previous piece, then 
the output of the final piece of the circuit will be equal to the output of the original 
circuit. This need not be the case if the intermediate inputs are not the outputs of the 
previous pieces, and so additional tests that ensure these inputs are at least close to the 
desired states are required. The swap test will be used extensively to perform these 
tests, though care must be taken to ensure that the resulting circuits have logarithmic 
depth. 
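
The slicing idea can be illustrated with a toy model. The sketch below is a simplification under stated assumptions: it uses pure states and random unitaries as the pieces, ignores ancillary qubits and the partial trace, and does not implement the swap tests; the function names are hypothetical. It shows that when every piece is given the output of the previous piece as its input, the last piece reproduces the output of the sequential circuit, while an arbitrary last input generally does not.

```python
import numpy as np

def random_unitary(dim, rng):
    """A random unitary obtained from the QR decomposition of a complex Gaussian matrix."""
    G = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    Q, _ = np.linalg.qr(G)
    return Q

def run_sequential(pieces, psi):
    """Apply the pieces one after another, as in the original circuit."""
    for U in pieces:
        psi = U @ psi
    return psi

def run_sliced(pieces, inputs):
    """Apply piece i to inputs[i] independently, as if all pieces ran in parallel."""
    return [U @ x for U, x in zip(pieces, inputs)]

def honest_inputs(pieces, psi):
    """Inputs for which every piece receives the output of the previous piece."""
    inputs = [psi]
    for U in pieces[:-1]:
        inputs.append(U @ inputs[-1])
    return inputs

rng = np.random.default_rng(3)
dim, n_pieces = 4, 5
pieces = [random_unitary(dim, rng) for _ in range(n_pieces)]
psi = np.zeros(dim, dtype=complex)
psi[0] = 1.0

inputs = honest_inputs(pieces, psi)
outputs = run_sliced(pieces, inputs)
# Consistent intermediate inputs: the last piece reproduces the original output.
assert np.allclose(outputs[-1], run_sequential(pieces, psi))
# Each output equals the next input, which is the condition the tests must enforce.
assert all(np.allclose(outputs[i], inputs[i + 1]) for i in range(n_pieces - 1))

# An inconsistent final input breaks the simulation.
cheating = list(inputs)
cheating[-1] = psi
assert not np.allclose(run_sliced(pieces, cheating)[-1], run_sequential(pieces, psi))
```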

This construction is similar to an idea of Gottesman and Chuang [GC99] in which
a circuit is sliced into constant depth pieces with teleportation used to transfer the
information between the pieces. Conditioned on all of the teleportations not requiring
a Pauli correction, this process produces a constant depth simulation (as a mixed state
circuit) of the original circuit. This process, however, does not perform any verification
that the teleportation operations were successful, and so the resulting simulation is
only accurate with exponentially small probability. This technique was used by Terhal
and DiVincenzo [TD04] to show that exactly simulating these circuits in polynomial
time leads to unexpected complexity-theoretic results (P = PP). The circuit used in the
paper does not conform to our circuit model, however, as an infinite set of gates is
allowed into the model. This difficulty was eliminated by Fenner et al. [FGHZ05],
who implemented the construction in a circuit model equivalent to the one used here.
This paper also provides an approximate simulation of the constant depth circuits in
classical polynomial time, which suggests that extremely short quantum circuits are
not interesting from a computational perspective.

For the reduction of Close Images to circuits of logarithmic depth we use a similar
technique to that of Gottesman and Chuang [GC99] of slicing the circuit into pieces.
In the place of teleportation, however, we demand that the inputs to each of the pieces
are provided before the start of the computation. These intermediate inputs are then
verified using a verification procedure to ensure that the constructed circuit faithfully
simulates the original circuit.

Figure 4.5: The original circuit $Q_i$ decomposed into constant depth unitary circuits.

To describe the reduction, let $Q_1$ and $Q_2$ be the circuits from an instance of $\mathrm{CI}_{1,b}$, and
let $n$ be the size of $Q_1$ and $Q_2$, padding the smaller circuit if necessary. In order to
slice the circuits into pieces it is assumed that Qi and Q2 first introduce any necessary 
ancillary qubits, then apply local unitary gates, and finally trace out any qubits that are 



not part of the output. This form for a circuit is shown in Figure and, as discussed 



in Section 2.1, this can be assumed without loss of generality and with only polynomial
overhead, by delaying any partial trace operations until the end of the computation
and introducing any needed ancillary qubits at the start of the computation.

A simple way to decompose $Q_1$ into constant depth pieces is to simply let each
gate of $Q_1$ be a single piece in the decomposition. Let $U_1, U_2, \ldots, U_n$ be these pieces,
with the additional complication that the operation $U_1$ both adds the ancillary qubits
and performs the first gate of the circuit. In a similar way, $Q_2$ can be decomposed into
constant depth pieces $V_1, V_2, \ldots, V_n$. Such a decomposition is shown in Figure 4.5. If
the circuits $Q_1$ and $Q_2$ implement transformations in $T(\mathcal{H}, \mathcal{K})$, then as we have assumed
that they are in Stinespring form, these circuits first introduce ancillary qubits in some
space $\mathcal{A}$, apply some unitary in $U(\mathcal{H} \otimes \mathcal{A}, \mathcal{B} \otimes \mathcal{K})$, and finally trace out the space $\mathcal{B}$. It
can be assumed that the spaces $\mathcal{A}$ and $\mathcal{B}$ are of the same dimension for both $Q_1$ and
$Q_2$, once again by padding the smaller circuit with unused ancillary qubits that are
later traced out. This implies that the spaces $\mathcal{H} \otimes \mathcal{A}$ and $\mathcal{B} \otimes \mathcal{K}$ are isomorphic. Using
these spaces, and implicitly this isomorphism, we have

$$U_1, V_1 \in U(\mathcal{H}_1, \mathcal{B}_1 \otimes \mathcal{K}_1),$$
$$U_i, V_i \in U(\mathcal{H}_i \otimes \mathcal{A}_i, \mathcal{B}_i \otimes \mathcal{K}_i) \quad \text{for } 2 \leq i \leq n,$$

where the subscripted spaces are copies of the non-subscripted spaces that hold the
input or output of one of the pieces of the original circuits. As an example of this
notation, if $\rho \in D(\mathcal{H})$, then the output of the circuit $Q_1$ on $\rho$ is given by

$$\operatorname{tr}_{\mathcal{B}_n} U_n U_{n-1} \cdots U_1 \rho U_1^* U_2^* \cdots U_n^*, \qquad (4.6)$$

and the output of $Q_2$ is given by the same expression with the $V_i$ in place of the
unitaries $U_i$.

Figure 4.6: Testing that the output of $U_i$ is close to the input of $U_{i+1}$. The inputs
are the ideal inputs to $U_j$, and are labelled for clarity only; no assumptions are made
about these states. Qubits that do not reach the right edge are traced out, but for clarity
these operations are not shown in the figure.

This decomposition of $Q_1$ and $Q_2$ will be used to construct circuits $C_1$ and $C_2$ that
have logarithmic depth and still, in some sense, faithfully implement $Q_1$ and $Q_2$. This
is done by placing the circuits corresponding to $U_1, \ldots, U_n$ in parallel, and tracing out
all the qubits that are not in the output of $U_n$. Such a circuit is constant depth, but does
not necessarily output a state in the image of $Q_1$, as the input to $U_{i+1}$ is not necessarily
close to the output from $U_i$. This problem is solved by comparing the output of $U_i$ to
the input to $U_{i+1}$ using the swap test. The swap test will fail to detect the case that the
two inputs are different with some probability, but in Section 4.5 it is shown that this
probability can be upper bounded by an expression involving the trace norm of the
two states.

In order for this comparison procedure to be done in log-depth an auxiliary input
is first compared against the input to $U_{i+1}$ and then held in reserve to compare to the
output of $U_i$. This strategy avoids comparing the input to $U_{i+1}$ directly to the
output of $U_i$, which leads to a circuit of linear depth. This depth reduction comes at a
cost, however, as the two states are always compared through an intermediary state,
which can at worst halve the probability of detecting when these two states differ,
since one test is replaced with two. This constant loss will not affect the main result in
a significant way. An example of the construction used to ensure that the output of $U_i$
agrees with the input to $U_{i+1}$ is given in Figure 4.6.

Figure 4.7: The outputs of $C_1$ and $C_2$. The dummy $|0\rangle$ qubits of one circuit line up with
the outputs of the swap tests of the other.



To simplify the analysis of the constructed circuits these two tests are controlled so
that exactly one of the two tests is performed. This will increase the failure probability
by another factor of two, but allows the analysis of each swap test to ignore the effect
of the other. To implement this scheme a control qubit is used so that either the first or
the second test is performed between every two pieces $U_i, U_{i+1}$ of the circuit. If a test
is not performed, then the value of the output qubit of the swap test is left unchanged,
and so the result of the test is a qubit in the $|0\rangle$ state. In the case that a test is performed,
the output is either $|0\rangle$ for the symmetric subspace (i.e. the two states are the same)
or $|1\rangle$ for the antisymmetric subspace (i.e. the two states differ). These outputs are
classical values, but they are treated as the two orthogonal quantum states $|0\rangle$ and $|1\rangle$
for convenience. Controlled application of these swap tests can be implemented in
log-depth using the techniques described in Proposition 2.1.



After adding these two tests between each piece of the circuit there is one final
modification to obtain the circuits $C_1$ and $C_2$. If any of the swap tests fail, i.e. detect
states in the antisymmetric subspace, then they will output qubits in the $|1\rangle$ state. As
yes instances of $\mathrm{CI}_{1,b}$ have outputs that are close together, we can ensure that if any of
the swap tests fail then the outputs of the constructed circuits are far apart by adding
dummy qubits in the $|0\rangle$ state to be compared to the outputs of the swap tests in the
other circuit. The arrangement of these dummy qubits is shown in Figure 4.7.



The constructed circuits $C_1$ and $C_2$ are obtained by decomposing $Q_1$ and $Q_2$ into
constant depth pieces, inserting the swap tests shown in Figure 4.6, and adding dummy
qubits to ensure that the swap tests in the other circuit do not fail. The final circuit
$C_1$ constructed from $Q_1$, including these dummy qubits, is shown in Figure 4.8; the
circuit $C_2$ is similar, with the exception that the qubits corresponding to the swap test
outputs and dummy qubits have been swapped, as shown in Figure 4.7. At the end of these
circuits, all qubits are traced out except the output (in the space $\mathcal{K}_n$) of $U_n$ or $V_n$, the
output of the swap tests, and the dummy zero qubits. Notice that the circuit $C_1$ can be
computed from $Q_1$ in polynomial time, as it is simply a rearrangement of the gates of
the original circuit with the addition of a linear number of extra gates.

Figure 4.8: The constructed circuit $C_1$. In the circuit $C_2$ the dummy zero output qubits
are swapped with the qubits containing the results of the swap tests. All qubits that
do not reach the right edge of the figure are traced out, but this is for notational
convenience only: the constructed circuits are in Stinespring form.

If the outputs of the circuits $C_1$ and $C_2$ are close together, then at an intuitive level
the output of the swap tests in each circuit must be close to zero and the outputs of $U_n$
and $V_n$ must also be close. If, with high probability, the swap tests do not fail (i.e. the
outputs are close to zero), then these circuits will more or less faithfully reproduce the
output of $Q_1$ and $Q_2$. Thus, in the case that the outputs of $C_1$ and $C_2$ can be made
close, we will be able to argue that the output of $Q_1$ and $Q_2$ can also be made close.
Proving that this picture is accurate forms the content of Section 4.5.

In the other direction, it is much simpler to argue that if there are states
$\rho, \xi \in D(\mathcal{H})$ such that $Q_1(\rho) = Q_2(\xi)$, then there are similar states
for the constructed circuits $C_1$ and $C_2$. This is the content of the following proposition.

Proposition 4.5. If there exist states $\rho, \xi$ such that $Q_1(\rho) = Q_2(\xi)$, then there
exist states $\rho', \xi'$ such that $C_1(\rho') = C_2(\xi')$.

Proof. To prove the proposition, states $\rho'$ and $\xi'$ are constructed so that

$$ C_1(\rho') = |0\rangle\langle 0| \otimes Q_1(\rho) = |0\rangle\langle 0| \otimes Q_2(\xi) = C_2(\xi'). \tag{4.7} $$

To find these states, notice that both the output fidelity and the construction of the
circuits $C_i$ do not change if additional ancillary qubits are added to the circuits $Q_i$ to
allow purification of the input states, so long as these extra qubits are traced out at the
end of the circuit. These purifications are pure states and all operations performed during
the circuit $Q_i$ are unitary, which implies that the intermediate states of the circuits are
also pure.

If a purification of the state $\rho$ is $|\psi\rangle$, then by providing the pure state

$$ |\gamma\rangle = |\psi\rangle \otimes (U_1|\psi\rangle)^{\otimes 2} \otimes (U_2 U_1|\psi\rangle)^{\otimes 2} \otimes \cdots \otimes (U_{n-1} U_{n-2} \cdots U_1|\psi\rangle)^{\otimes 2} \tag{4.8} $$

as input to $C_1$, the output of each block of the circuit will be identical to the input to
the next block, by construction. All but the first piece of this state is repeated twice:
this is to provide the correct intermediate inputs that are used by the swap tests to
compare the output of one block to the input of the next, as shown in Figure 4.6. This
ensures that all the swap tests will succeed with probability one, which can be seen
from Equation (4.3). Let this constructed state $\rho'$ be given by
$\rho' = |\gamma\rangle\langle\gamma|$.
It remains to check that $C_1$ on $\rho'$ simulates $Q_1$ on $\rho$. By the construction of
$C_1$ and $\rho'$, the output is exactly

$$ C_1(\rho') = |0\rangle\langle 0| \otimes \operatorname{tr}_{\mathcal{B}_n} U_n U_{n-1} \cdots U_1 \, \rho \, U_1^* U_2^* \cdots U_n^*, $$

which is equal to the output of $Q_1$ on $\rho$, up to a number of qubits in the $|0\rangle$
state, which correspond to the dummy qubits and the outputs of the swap tests. By the symmetry
of the construction, a state $\xi'$ for the circuit $C_2$ can be constructed from $\xi$ in the
same way, and for these constructed $\rho'$ and $\xi'$ Equation (4.7) is satisfied, which
completes the proof. □

This proposition implies that the reduction presented in this section maps yes
instances of $\mathrm{CI}_{1,b}$ to yes instances of Log-depth $\mathrm{CI}_{1,b}$. The remaining
direction is considerably more intricate, and forms the content of the next section.



4.5 Correctness of the reduction 

In this section it is argued formally that the reduction presented in the previous section
maps negative instances $(Q_1, Q_2)$ of the close images problem, i.e. those for which
$Q_1(\rho)$ and $Q_2(\sigma)$ are far apart for all states $\rho$ and $\sigma$, to negative
instances of log-depth close images. Less formally, it is shown that if the images of the
original circuits $Q_1$ and $Q_2$ are far apart then so must be the images of the constructed
circuits $C_1$ and $C_2$. This argument is technical but fairly straightforward: the basic idea
is to transform the fidelity to the trace norm using the Fuchs-van de Graaf Inequalities
(Theorem 3.22), apply the triangle inequality to reduce the problem to individual blocks of the
circuits $C_1$ and $C_2$, and finally return to the fidelity with another application of the
Fuchs-van de Graaf Inequalities. As might be expected, this technique results in poor error
bounds: the value $b$, which is the maximum output fidelity allowed in a no instance of
the close images problem, ends up polynomially close to 1. This is dealt with using a
parallelization technique due to Kitaev and Watrous [KW00] that improves the value
of $b$ to any constant $b > 0$.

As the constructed circuits $C_1$ and $C_2$ can be used to simulate $Q_1$ and $Q_2$, by
Proposition 4.5, the result is obtained by arguing that either the outputs of $C_1$ and $C_2$
correspond to the outputs of $Q_1$ and $Q_2$, respectively, or the outputs of $C_1$ and $C_2$
are far apart. In the case that this simulation is not faithful it is shown that some swap test
fails with non-negligible probability. This implies that the outputs of the constructed
circuits are far apart, as the failing swap test produces a state of the form
$(1-p)|0\rangle\langle 0| + p|1\rangle\langle 1|$ that has low fidelity with the corresponding
dummy zero qubit of the other circuit.
Lemma 4.4, which describes the behaviour of the swap test on mixed states, does not
immediately apply to the circuits $C_1$ and $C_2$. This is because in these circuits the output
of one block of the circuit is not directly compared to the input to the next block, but
instead each of these states is, with probability 1/2, compared to some intermediate
value. In order to deal with this difficulty, we use the Fuchs-van de Graaf Inequalities
(Theorem 3.22) to translate the fidelity to a relation involving the trace norm, to which
we can then apply the triangle inequality. The triangle inequality shows that at
least one of the two swap tests fails with probability bounded below by an expression
involving the fidelity, which is lower bounded by Lemma 4.4. In the proof of the
following corollary the reduced states of various parts of the input to either of the
circuits $C_1$ or $C_2$ are used, but no assumption is made on the form of the input state,
i.e. it is not assumed that the input is separable across the block boundaries of the
circuit. For instance, the density matrices $\rho_i$, $\sigma_i$, and $\xi_i$ that appear below
may be part of some larger entangled pure state, so that the failure probabilities of the two
swap tests need not be independent. To clarify the notation used below, the state $\xi_i$
is the output of the $(i-1)$st block (i.e. the output of $U_{i-1}$ in Figure 4.2), the state
$\rho_i$ is the input to the $i$th block, and the state $\sigma_i$ is the intermediate state
used to indirectly compare $\rho_i$ and $\xi_i$.

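Corollary 4.6. If $|\psi\rangle$ is input to the circuit $C_a$ for $a \in \{1,2\}$, with
$\rho_i$ the reduced state of $|\psi\rangle$ on $\mathcal{H}_i \otimes \mathcal{A}_i$, then at
least one of the swap tests on the $i$th block of $C_a$ fails with probability at least

$$ \frac{1}{128} \left\| U_i \rho_{i-1} U_i^* - \rho_i \right\|_{\mathrm{tr}}^2. $$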

Proof. In the $i$th block of $C_a$ there are two inputs to the first swap test: let the reduced
density operators of these inputs be $\rho_i$ and $\sigma_i$, as discussed above. The inputs to
the second swap test are then given by $\sigma_i$ and $U_i \rho_{i-1} U_i^* = \xi_i$. As exactly
one of these tests is performed we do not need to consider the effect of the first test on the
state when considering the second test, and so the same input state $\sigma_i$ is used for both
swap tests.

By Lemma 4.4 the failure probabilities of the first and second tests, when performed,
are at least $\frac{1}{2}(1 - F(\rho_i, \sigma_i))$ and $\frac{1}{2}(1 - F(\sigma_i, \xi_i))$,
respectively. Thus, the probability $p$ that at least one of these tests fails, given that each
of them is performed with probability 1/2, is at least

$$ p \ge \frac{1}{2} \max\left\{ \frac{1}{2}\bigl(1 - F(\sigma_i, \xi_i)\bigr), \frac{1}{2}\bigl(1 - F(\rho_i, \sigma_i)\bigr) \right\} = \frac{1}{4}\bigl(1 - \min\{F(\sigma_i, \xi_i), F(\rho_i, \sigma_i)\}\bigr). $$

By the Fuchs-van de Graaf inequalities (Theorem 3.22), this fidelity may be replaced
by the trace norm. Doing so, we obtain

$$ p \ge \frac{1}{32} \max\left\{ \|\sigma_i - \xi_i\|_{\mathrm{tr}}^2, \|\rho_i - \sigma_i\|_{\mathrm{tr}}^2 \right\}, $$

where we have made use of Bernoulli's inequality to show that $\sqrt{1-x} \le 1 - x/2$ when
simplifying this equation. Finally, as this maximum must be at least the average of the
two values,

$$ p \ge \frac{1}{32} \left( \frac{\|\sigma_i - \xi_i\|_{\mathrm{tr}} + \|\rho_i - \sigma_i\|_{\mathrm{tr}}}{2} \right)^2 \ge \frac{1}{128} \|\rho_i - \xi_i\|_{\mathrm{tr}}^2, $$

where the last inequality is the triangle inequality. □

By repeatedly applying some of the properties of the trace norm discussed in
Chapter 3, it is somewhat tedious but not difficult to use Corollary 4.6 to bound the
distance between the images of the constructed circuits in terms of the distance between
the images of the original circuits. In the following theorem $n$ is the size of the circuits
$Q_1$ and $Q_2$, as in Section 4.4. Informally, this theorem states that "no" instances of the
problem $\mathrm{CI}_{1,b}$ are mapped to "no" instances of the problem Log-depth
$\mathrm{CI}_{1,b'}$, with the resulting value $b'$ only polynomially closer to 1 than the value
$b$, which shows the QIP-completeness of log-depth close images for these large values of $b$.

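Theorem 4.7. If $F(Q_1(\rho_0), Q_2(\xi_0)) < 1 - c$ for all $\rho_0, \xi_0 \in D(\mathcal{H})$, then

$$ F(C_1(\rho), C_2(\xi)) < 1 - \frac{c^2}{576 n^2} $$

for all $\rho, \xi \in D\bigl(\mathcal{H} \otimes \bigotimes_{i=2}^{2n} \mathcal{H}_i \otimes \mathcal{A}_i\bigr)$.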



Proof. Let $\rho$ and $\xi$ be inputs to $C_1$ and $C_2$, and let $\rho_i, \xi_i$ be the reduced
states of these inputs on $\mathcal{H}_i \otimes \mathcal{A}_i$ for $1 \le i \le 2n$, where the
states for $i > n$ are the inputs that are only used by the swap tests, which will not be
referred to explicitly. That is, $\rho_i$ and $\xi_i$ for $1 \le i \le n$ are the portions of
the state that are input to the unitaries $U_i$ and $V_i$ that make up the circuits $Q_1$ and
$Q_2$, as shown in Figure 4.2. The output of the circuits $C_1$ and $C_2$ is given by the output
qubits corresponding to the swap tests as well as the states
$\operatorname{tr}_{\mathcal{B}_n} \rho_n$ and $\operatorname{tr}_{\mathcal{B}_n} \xi_n$, where
$\mathcal{B}_n$ is simply the space that is traced out to obtain the output from the unitary
representations of the original circuits. In this notation, $\rho_1$ and $\xi_1$ are the inputs
to the first pieces $U_1$ and $V_1$ of the constructed circuits $C_1$ and $C_2$. These two
states are density matrices in $D(\mathcal{H}_1) = D(\mathcal{H})$, and we can also consider
them as potential inputs to the original circuits $Q_1$ and $Q_2$.
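By the condition on the fidelity of $Q_1$ and $Q_2$ in the statement of the theorem, as
well as the Fuchs-van de Graaf inequalities (Theorem 3.22), we have

$$ 2c < \| Q_1(\rho_1) - Q_2(\xi_1) \|_{\mathrm{tr}}. $$

Using the triangle inequality we can relate this to the distance between the constructed
circuits. By adding terms and simplifying, we obtain

$$ \begin{aligned}
2c &< \| Q_1(\rho_1) - \operatorname{tr}_{\mathcal{B}_n} \rho_n + \operatorname{tr}_{\mathcal{B}_n} \xi_n - Q_2(\xi_1) + \operatorname{tr}_{\mathcal{B}_n} \rho_n - \operatorname{tr}_{\mathcal{B}_n} \xi_n \|_{\mathrm{tr}} \\
&\le \| Q_1(\rho_1) - \operatorname{tr}_{\mathcal{B}_n} \rho_n \|_{\mathrm{tr}} + \| \operatorname{tr}_{\mathcal{B}_n} \xi_n - Q_2(\xi_1) \|_{\mathrm{tr}} + \| \operatorname{tr}_{\mathcal{B}_n} \rho_n - \operatorname{tr}_{\mathcal{B}_n} \xi_n \|_{\mathrm{tr}}.
\end{aligned} $$

We now observe that
$\| \operatorname{tr}_{\mathcal{B}_n} \rho_n - \operatorname{tr}_{\mathcal{B}_n} \xi_n \|_{\mathrm{tr}} \le \| C_1(\rho) - C_2(\xi) \|_{\mathrm{tr}}$
by the monotonicity of the trace norm under quantum operations (Theorem 3.8), since the former
can be obtained from the latter by applying the partial trace on the appropriate space. Using
this we have

$$ 2c < \| Q_1(\rho_1) - \operatorname{tr}_{\mathcal{B}_n} \rho_n \|_{\mathrm{tr}} + \| \operatorname{tr}_{\mathcal{B}_n} \xi_n - Q_2(\xi_1) \|_{\mathrm{tr}} + \| C_1(\rho) - C_2(\xi) \|_{\mathrm{tr}}. \tag{4.9} $$

As the three terms on the right are nonnegative, at least one of them must be larger
than the average $2c/3$. If $\| C_1(\rho) - C_2(\xi) \|_{\mathrm{tr}} > 2c/3$ then
$F(C_1(\rho), C_2(\xi)) < 1 - c^2/18$ by Theorem 3.22 and there is nothing left to prove.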



The cases where one of the first two terms of (4.9) exceeds $2c/3$ are symmetric, and
so we can consider only the first term. Expanding $Q_1(\rho_1)$ in terms of the $U_i$, we obtain

$$ \begin{aligned}
\frac{2c}{3} &< \| Q_1(\rho_1) - \operatorname{tr}_{\mathcal{B}_n} \rho_n \|_{\mathrm{tr}} \\
&= \| \operatorname{tr}_{\mathcal{B}_n} U_n \cdots U_1 \rho_1 U_1^* \cdots U_n^* - \operatorname{tr}_{\mathcal{B}_n} \rho_n \|_{\mathrm{tr}} \\
&\le \| U_n \cdots U_1 \rho_1 U_1^* \cdots U_n^* - \rho_n \|_{\mathrm{tr}},
\end{aligned} $$

where once again the monotonicity of the trace norm under the partial trace (Theorem 3.8)
has been used. By adding and subtracting the term $U_n \cdots U_2 \rho_2 U_2^* \cdots U_n^*$
inside the norm, and then applying both the triangle inequality and the unitary invariance
of the trace norm, we have

$$ \frac{2c}{3} < \| U_1 \rho_1 U_1^* - \rho_2 \|_{\mathrm{tr}} + \| U_n U_{n-1} \cdots U_2 \rho_2 U_2^* U_3^* \cdots U_n^* - \rho_n \|_{\mathrm{tr}}. $$

Here the unitary invariance of the trace norm has been used to discard the operators
$U_2, \ldots, U_n$ from the first term. Repeating this strategy, by adding and subtracting the
term $U_n \cdots U_3 \rho_3 U_3^* \cdots U_n^*$ and once again applying the triangle inequality
results in

$$ \frac{2c}{3} < \| U_1 \rho_1 U_1^* - \rho_2 \|_{\mathrm{tr}} + \| U_2 \rho_2 U_2^* - \rho_3 \|_{\mathrm{tr}} + \| U_n U_{n-1} \cdots U_3 \rho_3 U_3^* U_4^* \cdots U_n^* - \rho_n \|_{\mathrm{tr}}. $$
Continuing in this fashion we have

$$ \frac{2c}{3} < \sum_{i} \| U_i \rho_i U_i^* - \rho_{i+1} \|_{\mathrm{tr}}. $$

As all terms in this sum are nonnegative, there must be at least one term in the sum
that exceeds $2c/(3n)$, as this is a lower bound on the average of all terms. Thus,
for some value of $i$, we have $\| U_i \rho_i U_i^* - \rho_{i+1} \|_{\mathrm{tr}} > 2c/(3n)$,
and so by Corollary 4.6 one of the corresponding swap tests fails with probability
$p > c^2/(288 n^2)$. The qubit representing the output value of this swap test is then of the
form $(1-p)|0\rangle\langle 0| + p|1\rangle\langle 1|$, and so, by the monotonicity of the
fidelity under the partial trace (Theorem 3.20),

$$ F(C_1(\rho), C_2(\xi)) \le F\bigl((1-p)|0\rangle\langle 0| + p|1\rangle\langle 1|, \; |0\rangle\langle 0|\bigr) = \sqrt{1-p} < 1 - \frac{c^2}{576 n^2}, $$

as in the statement of the theorem. □
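The chain of constants at the end of this proof can be checked numerically. The sketch below assumes the bound of Corollary 4.6 in the form $p \ge d^2/128$ for a block whose trace-norm mismatch is $d$, takes $d$ at the threshold $2c/(3n)$, and confirms that the resulting fidelity $\sqrt{1-p}$ falls below $1 - c^2/(576 n^2)$. The function names and sample values of $c$ and $n$ are illustrative assumptions, not part of the original argument.

```python
import math

def swap_test_failure_bound(distance):
    """Lower bound distance^2 / 128 on the failure probability of some swap test
    in a block whose trace-norm mismatch is `distance` (as in Corollary 4.6)."""
    return distance ** 2 / 128.0

def fidelity_with_zero(p):
    """Fidelity between (1-p)|0><0| + p|1><1| and |0><0|, which equals sqrt(1 - p)."""
    return math.sqrt(1.0 - p)

def theorem_bound(c, n):
    """The bound 1 - c^2 / (576 n^2) from the statement of Theorem 4.7."""
    return 1.0 - c ** 2 / (576.0 * n ** 2)

def check_constants(c, n):
    """Return (p, fidelity, bound) for a block mismatch at the threshold 2c / (3n)."""
    mismatch = 2.0 * c / (3.0 * n)
    p = swap_test_failure_bound(mismatch)  # algebraically equal to c^2 / (288 n^2)
    return p, fidelity_with_zero(p), theorem_bound(c, n)

# Hypothetical sample parameters.
for c, n in [(0.5, 10), (0.9, 5), (1.0, 3)]:
    p, fidelity, bound = check_constants(c, n)
    print(c, n, p, fidelity, bound, fidelity < bound)
```

The final comparison is Bernoulli's inequality again: $\sqrt{1-p} \le 1 - p/2$, with $p/2 = c^2/(576 n^2)$ at the threshold.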



By combining Theorem 4.7 with Proposition 4.5 and the multiplicativity of the
maximum output fidelity of two transformations, given as Theorem 3.24, we obtain
the main result of this chapter.

Corollary 4.8. Log-depth $\mathrm{CI}_{a,b}$ is QIP-complete for any constants $0 < b < a \le 1$.



Proof. Theorem 4.7 together with Proposition 4.5 establish the result that
$\mathrm{CI}_{1,b}$ reduces to Log-depth $\mathrm{CI}_{1,b'}$ for
$b' \ge 1 - (1-b)^2/(576 n^2)$, where $n$ is an upper bound on the size of the circuits.

The value of $b'$ can be improved using Theorem 3.24 of Kitaev, Shen, and Vyalyi [KSV02],
which shows that if the circuits $C_1$ and $C_2$ are repeated $r$ times in parallel,
then the maximum output fidelity is

$$ \max_{\rho, \xi} F\bigl(C_1^{\otimes r}(\rho), C_2^{\otimes r}(\xi)\bigr) = \max_{\rho, \xi} F\bigl(C_1(\rho), C_2(\xi)\bigr)^r. $$

This implies that $\mathrm{CI}_{1,b}$ reduces to Log-depth $\mathrm{CI}_{1,b'}$ for all

$$ b' \ge \left( 1 - \frac{(1-b)^2}{576 n^2} \right)^r, $$

and so, by taking $r$ polynomially large in $n$ and $b$, we may take $b' \le b$, which implies
that $\mathrm{CI}_{1,b}$ reduces to Log-depth $\mathrm{CI}_{1,b}$.

This shows that Log-depth $\mathrm{CI}_{1,b}$ is then QIP-complete for any constant
$0 < b < 1$, as by Theorem 4.3, due to Kitaev and Watrous [KW00], $\mathrm{CI}_{a,b}$ is
QIP-complete for all $0 < b < a \le 1$. Generalizing the log depth close images problem for all
values of $a$ gives the problem Log-depth $\mathrm{CI}_{a,b}$ for $0 < b < a \le 1$, which is
also complete for QIP as it can be obtained by weakening the promise. This more general problem
is in QIP as it is a restriction of $\mathrm{CI}_{a,b}$ to log-depth circuits. □
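To make the phrase "polynomially large" concrete, the following sketch computes how many parallel repetitions $r$ are needed before $(1 - (1-b)^2/(576 n^2))^r$ drops to at most $b$. It is only an illustration of the arithmetic in the proof; the function names are hypothetical, and the computation uses floating point, so it is not suitable for extreme parameter values.

```python
import math

def single_copy_bound(b, n):
    """No-instance fidelity bound 1 - (1 - b)^2 / (576 n^2) after one copy of the reduction."""
    return 1.0 - (1.0 - b) ** 2 / (576.0 * n ** 2)

def repetitions_needed(b, n):
    """Smallest r (as computed in floating point) with single_copy_bound(b, n) ** r <= b."""
    b_prime = single_copy_bound(b, n)
    # Solve b_prime ** r <= b for r using logarithms, then correct for rounding.
    r = max(1, math.ceil(math.log(b) / math.log(b_prime)))
    while b_prime ** r > b:
        r += 1
    while r > 1 and b_prime ** (r - 1) <= b:
        r -= 1
    return r

# Hypothetical sample parameters: target fidelity bound b and circuit size n.
for b, n in [(0.5, 10), (0.9, 10), (0.5, 100)]:
    print(b, n, repetitions_needed(b, n))
```

Since $-\ln(1 - \delta) \ge \delta$, the number of repetitions is at most about $576 n^2 \ln(1/b) / (1-b)^2$, which is polynomial in $n$ for any fixed constant $b$.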

As the circuits constructed by the reduction only make use of logarithmic depth
when performing swap tests, and the controlled swap operations performed by these
tests can be accomplished in constant depth using unbounded fan-out gates, as described
in Proposition 2.2, the following corollary follows immediately from the previous one.

Corollary 4.9. The problem Const-depth $\mathrm{CI}_{a,b}$ on circuits with the unbounded
fan-out gate is QIP-complete for any constants $0 < b < a \le 1$.



4.6 Conclusion 

In this chapter, the problem Close Images has been introduced. This problem asks:
given two quantum channels, as mixed state quantum circuits, how close are the
images of the two channels? More concretely, how large is the minimum distance of
any two outputs of the channels, where the fidelity is used as the notion of distance?
This problem is complete for the class QIP [KW00].

The main result of the chapter is a reduction of this problem to the case of loga- 
rithmic depth circuits. This reduction works only for the case that the two circuits 
are promised to either have intersecting images or images that are far apart, but a 
hardness result on this special case also implies the hardness of the general problem. 
This restriction is necessary to the proof that the reduction is correct as it enables the 
use of a parallel repetition technique to strengthen the promise of the class of instances 
that is shown to be hard. 

This hardness result is the basis for the main result of the next chapter, which is
that the computational problem of distinguishing quantum circuits is also QIP-hard.
The result of this chapter enables the hardness of this distinguishability problem to be
extended even to the case of channels implemented by log-depth circuits.
Chapter 5
Distinguishability of Quantum Computations


A natural problem in quantum information is to discriminate between two quantum
channels. In the model where channels are represented as quantum circuits, this is
the computational distinguishability problem on channels, though the difficulty of
the problem does not change if the circuits are replaced by black boxes that can be
performed, but not inspected. The main result of this chapter is that this problem is
computationally very difficult, as it is complete for the class QIP of problems having
quantum interactive proof systems. This also implies [JJUW09] that this problem is
complete for PSPACE, which gives a new quantum characterization for a classical
complexity class.

The majority of the results in this chapter are in collaboration with John Watrous,
and have been published in [RW05]. The results in Section 5.6 have appeared
in [Ros08b].
5.1 Overview of distinguishability problems



The problem of distinguishing two computations is central to computer science. This 
is the problem that asks, given two computations represented in some way, do the 
two computations always produce results that are close together, or are there inputs 
on which they act differently? This is an important problem both theoretically and 
practically: determining if some process has been correctly implemented is one of the 
most important tasks in experimental quantum computing. 

The problem of estimating an unknown quantum channel is known as process
tomography [CN97, PCZ97]. All known approaches for approximate process tomography
unfortunately require exponential time. This is not a surprise, however, as the
complete characterization of a quantum channel on $n$ qubits requires an exponential
number of parameters. The main result of this chapter is that even the simpler task
of distinguishing two quantum channels, given as mixed-state circuits, is computationally
intractable. That the distinguishability problem reduces to process tomography
is clear: one way to solve the problem is to characterize the two channels with enough
accuracy to detect the case that they are far apart.

Returning to classical complexity theory, the most basic distinguishability problem
asks: given two classical deterministic circuits, is there an input on which they produce
different outputs? This problem is in NP, as given such an input a verifier can both
simulate the two circuits and check to see if they agree, all in polynomial time. This
problem is also complete for NP. To see this, notice that a circuit is satisfiable if and
only if it is distinguishable from the circuit that always outputs false. The satisfiability
problem is the original NP-complete problem [Coo71], and so the problem of
distinguishing classical deterministic computations must also be NP-complete.
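The reduction from satisfiability described above can be sketched in a few lines. In this illustration circuits are modelled as ordinary Python functions on Boolean inputs, which is an assumption of the example rather than the circuit model used in this thesis, and the search is a brute-force loop over all inputs: only the verification of a given distinguishing input is efficient.

```python
from itertools import product

def distinguishing_input(circuit_a, circuit_b, num_inputs):
    """Return an input on which the two circuits differ, or None if they agree everywhere.

    The search takes time exponential in num_inputs; checking a proposed
    witness only requires evaluating both circuits once.
    """
    for bits in product([False, True], repeat=num_inputs):
        if circuit_a(*bits) != circuit_b(*bits):
            return bits
    return None

def always_false(*bits):
    """The circuit that outputs false on every input."""
    return False

def satisfying_assignment(circuit, num_inputs):
    """A circuit is satisfiable exactly when it is distinguishable from always_false."""
    return distinguishing_input(circuit, always_false, num_inputs)

# Hypothetical example circuits on three inputs.
def example(x, y, z):
    return (x or y) and (not x) and z

def contradiction(x, y, z):
    return x and (not x)

print(satisfying_assignment(example, 3))
print(satisfying_assignment(contradiction, 3))
```

Any distinguishing input returned for `example` is also a satisfying assignment, while `contradiction` agrees with the constant-false circuit on every input.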

[Figure 5.1: Complexity classes and distinguishability problems. Labels recoverable from the
figure: EXP; QIP (Mixed-state Quantum); QMA (Unitary Quantum); NP (Deterministic Classical).]

Adding randomness to the circuit model, in the form of gates that produce unbiased
coin flips, appears to increase the difficulty of the problem. Averaged over the values
of the coin flips, the two circuits in the distinguishability problem produce probability
distributions, and distinguishing these distributions is also a nontrivial problem. To
avoid the problem of distinguishing randomized circuits being artificially difficult,
the additional promise is given that the two probabilistic circuits to be distinguished
either produce output distributions that are very close for any input, or that there exists
some input on which the distributions produced are far apart. The usual measure of
distance on probability distributions is the difference in the $\ell_1$ norm, which is defined,
for distributions $p, q$, as $\|p - q\|_1 = \sum_x |p(x) - q(x)|$. This norm is simply the
classical analogue of the trace norm of the difference of two density operators. Even when given
an input on which they produce maximally distant output distributions, distinguishing
two randomized circuits is complete for the class SZK of problems with statistical
zero-knowledge proof systems [SV03]. This is in sharp contrast to the deterministic case
where a verifier can check, given an input, whether or not two deterministic circuits
produce the same output. The best complexity theoretic upper bound known for the
randomized circuit distinguishability problem is AM, by simply having the prover
first produce an optimal distinguishing input for the two circuits and then performing
the SZK protocol due to Sahai and Vadhan [SV03] for the statistical difference problem
that remains. This problem is not known to be complete for AM.
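As an illustration of the $\ell_1$ distance between the output distributions of two randomized circuits, the sketch below enumerates all coin flips of two toy circuits and computes the distance exactly for each input. The circuits, their interface `circuit(x, coins)`, and the function names are hypothetical; exact enumeration is only feasible for very few coins, which is consistent with the hardness discussed above.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def output_distribution(circuit, x, num_coins):
    """Exact output distribution of circuit(x, coins) over uniformly random coin flips."""
    counts = Counter(circuit(x, coins) for coins in product((0, 1), repeat=num_coins))
    total = 2 ** num_coins
    return {out: Fraction(k, total) for out, k in counts.items()}

def l1_distance(p, q):
    """The l1 distance sum_x |p(x) - q(x)| between two distributions given as dicts."""
    return sum(abs(p.get(x, 0) - q.get(x, 0)) for x in set(p) | set(q))

# Two toy randomized circuits on one input bit and two coins.
def circuit_one(x, coins):
    return x ^ coins[0]                # uniformly random output for every input

def circuit_two(x, coins):
    return x & (coins[0] | coins[1])   # biased towards x, and constant 0 when x = 0

for x in (0, 1):
    p = output_distribution(circuit_one, x, 2)
    q = output_distribution(circuit_two, x, 2)
    print(x, l1_distance(p, q))
```

The promise problem concerns the maximum of this quantity over all inputs, which in this toy example is attained at the input 0.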

Extending this problem in the direction of quantum information leads first to the
natural problem of distinguishing unitary circuits. As in the case of randomized
classical circuits, the outputs of unitary quantum circuits are nontrivial to compare.
Once again, a promise is used to keep the problem from being artificially difficult. In
this case, the distance measure can be either the diamond norm or the trace norm,
as it is known that they agree on unitary transformations [AKN98, CPR00], and the
promise is that the distance between the two given transformations, which is given by

$$ \max_{|\psi\rangle} \left\| U|\psi\rangle\langle\psi|U^* - V|\psi\rangle\langle\psi|V^* \right\|_{\mathrm{tr}}, $$

is either close to zero or close to two. The input in this equation may be restricted to
pure states by Theorem 3.17. This problem has been shown to be complete for QMA
by Janzing, Wocjan, and Beth [JWB05]. This result implies that, given an optimal input
state, a verifier with access to a quantum computer can solve the distinguishability
problem on unitary circuits. This is not unexpected, as unitary circuits are similar to
deterministic quantum computation.

Adding both the elements of randomness and quantumness to the computations
being distinguished results in a significantly more difficult problem than adding just
one of the two elements. This is the distinguishability problem for mixed-state quantum
circuits, as defined in Section 2.1. Once again we require a promise to avoid an
artificially difficult problem: the circuits to be distinguished can be assumed to have
a diamond norm difference that is either close to zero or close to two. It is surprising
that this problem is QIP-complete. The class QIP is believed to be much larger than
the classes QMA and AM. As evidence of this, the polynomial hierarchy collapses if
QIP is contained in either of these classes. These complexity classes, known inclusions
among them, and the distinguishability problems related to them, are summarized in
Figure 5.1. Removing either the quantum computing, leaving randomized circuits, or
randomness, leaving unitary quantum circuits, seems to change the essential character
of the problem: the hardness appears to lie in the combination of both of these
ingredients.

The main result of this section is showing the QIP-completeness of the mixed-state
circuit distinguishability problem using a Karp reduction from the problem Close
Images of Chapter 4. The main technique is a scheme for taking two transformations
that are close together for some inputs and producing from them two transformations
that act very differently on a specific input state, whereas when the scheme is applied to
transformations that are far apart, the resulting transformations are very close together.
In some sense this reduction inverts the distance between two circuits: circuits that
are close together are mapped to circuits that are far apart, and vice versa, though the
definitions of distance used in the close images and distinguishability problems are
not the same.



5.2 Quantum circuit distinguishability 

The problem of distinguishing mixed state quantum circuits can be stated more
intuitively in the following way: given a black box that is promised to implement one of
two known quantum channels, with what probability can the channel be identified
with only a single use of the black box? As was discussed in Section 3.5, the maximum
probability that the correct channel can be identified is given by

$$ \frac{1}{2} + \frac{\|\Phi_1 - \Phi_2\|_\diamond}{4}, $$

where $\Phi_1, \Phi_2$ represent the two known channels. This implies that the
problem of estimating the diamond norm of the difference of two channels is equivalent
to estimating the probability that the black box can be correctly identified with a single
use, given descriptions of the two channels $\Phi_1, \Phi_2$.
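The formula above has the same shape as the optimal probability $\frac{1}{2} + \frac{1}{4}\|\rho_1 - \rho_2\|_{\mathrm{tr}}$ of identifying one of two equally likely states, which is what remains once an input to the black box has been fixed. The sketch below evaluates this state-discrimination bound for single-qubit states with real entries. It does not compute the diamond norm, which additionally maximizes over (possibly entangled) inputs; the representation of a state as a triple `(a, b, d)` for the matrix `[[a, b], [b, d]]` is an assumption of this example.

```python
import math

def trace_norm_2x2(a, b, d):
    """Trace norm of the real symmetric matrix [[a, b], [b, d]]:
    the sum of the absolute values of its two eigenvalues."""
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    return abs(mean + radius) + abs(mean - radius)

def identification_probability(rho1, rho2):
    """Optimal probability 1/2 + ||rho1 - rho2||_tr / 4 of identifying which of two
    equally likely states was prepared; each state is (a, b, d) for [[a, b], [b, d]]."""
    diff = tuple(x - y for x, y in zip(rho1, rho2))
    return 0.5 + trace_norm_2x2(*diff) / 4.0

# Hypothetical example states.
zero_state = (1.0, 0.0, 0.0)   # |0><0|
one_state = (0.0, 0.0, 1.0)    # |1><1|
plus_state = (0.5, 0.5, 0.5)   # |+><+|

print(identification_probability(zero_state, one_state))   # orthogonal states
print(identification_probability(zero_state, plus_state))  # non-orthogonal states
print(identification_probability(zero_state, zero_state))  # identical states
```

Orthogonal outputs can be identified with certainty, identical outputs only by guessing, and every other pair lies strictly in between.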
To obtain a computational problem, let these channels be given by mixed-state
quantum circuits: this results in the quantum circuit distinguishability problem that is
the focus of this chapter. The main result of the chapter is to show that this problem is
complete for QIP. As a by-product, the definition of this problem implies that simply
deciding if the two channels are close together for all inputs or far apart on some
input state is equivalent to determining with what probability the correct channel is
identified in the black box problem. The formal definition of the distinguishability
(promise) problem is given below.

Problem 5.1 (Quantum Circuit Distinguishability). For constants $0 \le b < a \le 2$,
the input consists of quantum circuits $Q_1$ and $Q_2$ that implement transformations in
$T(\mathcal{H}, \mathcal{K})$. The promise problem is to distinguish the two cases:

Yes: $\|Q_1 - Q_2\|_\diamond \ge a$.
No: $\|Q_1 - Q_2\|_\diamond \le b$.



Less formally, this problem asks: is there an input density matrix $\rho$ on which the
circuits $Q_1$ and $Q_2$ can be made to act differently? Theorem 3.17 implies that this
problem can be stated in terms of pure state inputs. For notational convenience,
this problem will be referred to as $\mathrm{QCD}_{a,b}$, with the logarithmic and constant-depth
variants referred to as Log-depth $\mathrm{QCD}_{a,b}$ and Const-depth $\mathrm{QCD}_{a,b}$,
though they will not be encountered until Section 5.6.

The Quantum Circuit Distinguishability problem appears on the surface to be very
similar to the Close Images problem considered in Chapter 4, but a closer inspection
reveals that this is not the case. Given two circuits $Q_1$ and $Q_2$, the close images
problem asks if there are two inputs $\rho$ and $\sigma$ on which the two circuits act the same,
i.e. $Q_1(\rho) \approx Q_2(\sigma)$. On the other hand, the circuit distinguishability problem
asks if there is one input $\rho$ for which the states $Q_1(\rho)$ and $Q_2(\rho)$ are nearly
orthogonal. The two problems ask for the two channels to have significantly different
properties, though the problems do not appear to be dual to each other in any real sense.

In addition to this, the circuit distinguishability problem has an operational meaning
in terms of how well an unknown quantum process chosen from a set of two known
channels can be identified. An alternate characterization of the problem is: given two
quantum channels, are they almost the same, or are there inputs on which they differ
significantly? This is a simplification of quantum process tomography [CN97, PCZ97],
which has many applications in quantum information, but is unfortunately intractable
in the computational sense. In contrast, the Close Images problem, as discussed in
Section 4.2, is quite closely related to whether or not the verifier can be made to accept
in a quantum interactive proof system. This makes the circuit distinguishability
problem interesting for the study of the class QIP, as it gives a quantum information
theoretic characterization of the class that is not a restatement of the definition.



5.3 QIP protocol 

The aim of this section is to present and analyze a protocol that puts the circuit 
distinguishability problem inside of QIP. This is an essential step in showing that this 
problem is complete for this class. 

The basic idea of the protocol used to achieve this is to have the prover send a state 
on which the two circuits are maximally distinguishable, apply one of the two circuits 
at random, and then ask the prover to determine which circuit has been applied. It 
is not hard to see that by playing honestly, the prover will be able to succeed with 
probability related to the diamond norm of the difference of the two circuits. It is only 
slightly more difficult to see that this is also the optimal strategy for a dishonest prover. 
A more complete description of the protocol follows. 

Protocol 5.2 (Quantum Circuit Distinguishability). As input, both the prover P and
the verifier V receive circuits $Q_1, Q_2 \in T(\mathcal{H}, \mathcal{K})$ of size at most $n$.
The three steps of the protocol are:

1. V receives from P a state $\rho \in D(\mathcal{H})$.

2. V chooses $i \in \{1, 2\}$ uniformly at random and sends $Q_i(\rho)$ back to P.

3. V receives from P some $j \in \{1, 2\}$, accepts if $i = j$, and rejects otherwise.

The idea behind this protocol is that if the two circuits are far apart the prover
can find an input state on which they are distinguishable. The prover is then asked
to perform this distinguishing task. Both choosing the state $\rho$ and performing the
measurement to distinguish the output states $Q_1(\rho)$ and $Q_2(\rho)$ may be computationally
intractable, and so in the protocol the prover is required to perform them: the verifier
only needs to flip a coin and apply one of the two circuits, which can be done in
polynomial time, given circuit descriptions.
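The three steps of the protocol can be illustrated in a purely classical toy setting. In the sketch below the two "circuits" are classical noisy channels given as tables of conditional probabilities, the prover is restricted to sending a classical input $x$, and the prover answers with whichever circuit is more likely to have produced the observed output. These restrictions, the dictionary representation, and the function names are all assumptions of the example; a quantum prover may send entangled states, so this is not a computation of the diamond norm.

```python
from fractions import Fraction

def l1_distance(p, q):
    """The l1 distance between two output distributions given as dicts."""
    return sum(abs(p.get(y, 0) - q.get(y, 0)) for y in set(p) | set(q))

def best_classical_prover_success(channel1, channel2):
    """Best acceptance probability over classical inputs x in the toy protocol.

    Step 1: the prover sends x.  Step 2: the verifier picks i uniformly and returns
    a sample y from channel_i[x].  Step 3: the prover guesses the circuit that is
    more likely to have produced y, succeeding with probability
    sum_y max(p1(y), p2(y)) / 2.
    """
    best = Fraction(0)
    for x in channel1:
        outputs = set(channel1[x]) | set(channel2[x])
        success = sum(
            Fraction(1, 2) * max(channel1[x].get(y, 0), channel2[x].get(y, 0))
            for y in outputs
        )
        best = max(best, success)
    return best

def max_l1_distance(channel1, channel2):
    """Largest l1 distance between the two output distributions over classical inputs."""
    return max(l1_distance(channel1[x], channel2[x]) for x in channel1)

# Hypothetical channels on one bit: the identity, and a channel that resets 1 to 0
# with probability 1/2.
identity = {0: {0: Fraction(1)}, 1: {1: Fraction(1)}}
lossy = {0: {0: Fraction(1)}, 1: {0: Fraction(1, 2), 1: Fraction(1, 2)}}

print(best_classical_prover_success(identity, lossy))
print(Fraction(1, 2) + max_l1_distance(identity, lossy) / 4)
```

In this restricted setting the best success probability equals one half plus one quarter of the largest $\ell_1$ distance, mirroring the role played by the diamond norm in the quantum protocol.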

Step 3 of this protocol does not strictly fit into the model of quantum interactive
proof systems. This is because the prover sends classical information to the verifier,
and not a quantum message. This difficulty can be avoided by allowing the prover to
send a qubit to the verifier, who immediately measures it in the computational basis.
Such a modification does not change the protocol, as any state sent during this step
gives the prover no way to do better than simply sending either $|1\rangle$ or $|2\rangle$.

To show that this protocol puts the distinguishability problem into QIP, it remains
to show that the prover can succeed with probability at least $p$ on yes instances and with
probability at most $q$ on no instances, for some values of $p, q$ that are at least
polynomially far apart. Error reduction for QIP allows this to be amplified to any constant
gap, as discussed in Section 2.2. This is the content of the following theorem.



Theorem 5.3. For any constants $a, b$ with $0 \le b < a \le 2$, $\mathrm{QCD}_{a,b} \in \mathrm{QIP}$.



Proof. To show that the verifier of Protocol 5.2 forms a quantum interactive proof
system for $\mathrm{QCD}_{a,b}$, bounds must be placed on the error probability of the protocol
for both positive and negative instances of the problem. This will be done by showing
that in either case, the maximum acceptance probability of the verifier is given by

$$ \frac{1}{2} + \frac{1}{4} \|Q_1 - Q_2\|_\diamond, $$

which is simply the optimal probability that a black box can be identified as either $Q_1$
or $Q_2$ with only a single use, as discussed in Corollary 3.16. It is not hard to see that
this is exactly the task faced by the prover in the protocol.



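By Theorem 3.17 there exists a Hilbert space $\mathcal{F}$ and a pure state
$|\psi\rangle \in \mathcal{H} \otimes \mathcal{F}$ such that

$$ \|Q_1 - Q_2\|_\diamond = \left\| (Q_1 \otimes I_{\mathcal{F}})(|\psi\rangle\langle\psi|) - (Q_2 \otimes I_{\mathcal{F}})(|\psi\rangle\langle\psi|) \right\|_{\mathrm{tr}}. $$

For this state $|\psi\rangle$, let

$$ \rho_1 = (Q_1 \otimes I_{\mathcal{F}})(|\psi\rangle\langle\psi|), \qquad \rho_2 = (Q_2 \otimes I_{\mathcal{F}})(|\psi\rangle\langle\psi|). $$

Let $\Pi_1$ and $\Pi_2 = I - \Pi_1$ be projection operators on $\mathcal{K} \otimes \mathcal{F}$
that specify an optimal projective measurement for distinguishing $\rho_1$ from $\rho_2$. These
projection operators form the Helstrom measurement, which is discussed in Theorem 3.9. Such a
measurement satisfies

$$ \operatorname{tr} \Pi_1 (\rho_1 - \rho_2) = \operatorname{tr} \Pi_2 (\rho_2 - \rho_1) = \frac{1}{2} \|\rho_1 - \rho_2\|_{\mathrm{tr}}, $$

as $\Pi_1$ is the projector onto the positive eigenvalues of $\rho_1 - \rho_2$ and
$\operatorname{tr}(\rho_1 - \rho_2) = 0$.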
A strategy for the prover that convinces the verifier to accept with probability

$$\frac{1}{2} + \frac{1}{4} \| Q_1 - Q_2 \|_\diamond$$

is as follows. The prover prepares the state $|\psi\rangle$ and sends the reduced state
$\rho = \mathrm{tr}_{\mathcal{F}} |\psi\rangle\langle\psi|$ to the verifier, keeping the
portion of the state on $\mathcal{F}$ in reserve.

Upon receiving $\sigma = Q_i(\rho)$ from the verifier, the prover measures the state on
$\mathcal{K} \otimes \mathcal{F}$ with the measurement $\{\Pi_1, \Pi_2\}$ and returns the
result to the verifier. By Theorem 3.9 this measurement correctly determines $i$ with
probability

$$\frac{1}{2} + \frac{1}{4} \| \rho_1 - \rho_2 \|_{\mathrm{tr}} = \frac{1}{2} + \frac{1}{4} \| Q_1 - Q_2 \|_\diamond.$$
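As a small numerical illustration of the Helstrom bound used in this strategy, the following sketch evaluates $\frac{1}{2} + \frac{1}{4}\|\rho_1 - \rho_2\|_{\mathrm{tr}}$ for pairs of single-qubit states. It is only a sketch under simplifying assumptions: the states are $2 \times 2$ density matrices rather than states on $\mathcal{K} \otimes \mathcal{F}$, and the function names are illustrative and do not appear in the text.

```python
import math

def trace_norm_traceless_hermitian_2x2(d):
    """Trace norm of a traceless Hermitian 2x2 matrix [[a, b], [conj(b), -a]].

    Its eigenvalues are +lam and -lam with lam = sqrt(a^2 + |b|^2),
    so the sum of their absolute values is 2 * lam.
    """
    a = complex(d[0][0]).real
    b = complex(d[0][1])
    return 2.0 * math.sqrt(a * a + abs(b) ** 2)

def helstrom_success(rho1, rho2):
    """Optimal probability of identifying which of two equally likely
    qubit states was prepared: 1/2 + 1/4 * ||rho1 - rho2||_tr."""
    diff = [[complex(rho1[i][j]) - complex(rho2[i][j]) for j in range(2)]
            for i in range(2)]
    return 0.5 + 0.25 * trace_norm_traceless_hermitian_2x2(diff)

# Example states (illustrative choices): |0><0|, |1><1| and |+><+|.
zero = [[1.0, 0.0], [0.0, 0.0]]
one = [[0.0, 0.0], [0.0, 1.0]]
plus = [[0.5, 0.5], [0.5, 0.5]]

p_same = helstrom_success(zero, zero)       # identical states: pure guessing
p_orth = helstrom_success(zero, one)        # orthogonal states: perfect
p_zero_plus = helstrom_success(zero, plus)  # non-orthogonal pure states
```

Identical states give success probability 1/2 and orthogonal states give 1, matching the two extremes of the acceptance probability when $\|Q_1 - Q_2\|_\diamond$ is 0 or 2.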

That this strategy is optimal can be argued as follows. Let
$\xi \in D(\mathcal{H} \otimes \mathcal{G})$ be the state of the system immediately after
the first message is sent, where the space $\mathcal{G}$ represents the private space of
the prover (which need not be the same size as the space $\mathcal{F}$ considered above).
The verifier applies either $Q_1$ or $Q_2$ to the space $\mathcal{H}$, which results in the
global state $(Q_1 \otimes I_{\mathcal{G}})(\xi)$ with probability 1/2 and
$(Q_2 \otimes I_{\mathcal{G}})(\xi)$ with probability 1/2. This state is, after step 2 of
the protocol, in the possession of the prover. The prover's final message to the verifier
is immediately measured by the verifier, resulting in a single bit. This process may be
viewed as a two-valued measurement on $\mathcal{K} \otimes \mathcal{G}$. The probability
that the optimal measurement of this type is correct is given by Theorem 3.9, and so

$$\frac{1}{2} + \frac{1}{4} \| (Q_1 \otimes I_{\mathcal{G}})(\xi) - (Q_2 \otimes I_{\mathcal{G}})(\xi) \|_{\mathrm{tr}} \le \frac{1}{2} + \frac{1}{4} \| Q_1 - Q_2 \|_\diamond$$

is an upper bound on the success probability of the prover, as required.

This gives a quantum interactive proof system for $\mathrm{QCD}_{a,b}$ that accepts yes
instances with probability $1/2 + a/4$ and accepts no instances with probability at most
$1/2 + b/4$. This proves that $\mathrm{QCD}_{a,b} \in \mathrm{QIP}$, as $b < a$ with at
least a polynomial gap between them, by the definition of the distinguishability problem. □



As discussed in Section 5.1, the version of this problem defined on classical randomized
circuits is contained in the complexity class AM. One way to see this is to consider
Protocol 5.2 with all of the quantum information removed. The prover can still send an
optimal distinguishing input in Step 1 and decide which distribution the sample is from in
Step 3. The analysis of this classical protocol is virtually identical to the quantum one:
this generalizes a result of Sahai and Vadhan [SV03] on the statistical difference problem
to the case of circuits that take input states.
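The classical analogue of the success probability can be checked directly. The sketch below, in which the function names and the example distributions are illustrative assumptions, compares the formula $\frac{1}{2} + \frac{1}{4}\|p - q\|_1$ with the success probability of the rule that guesses whichever distribution assigns the larger probability to the observed sample.

```python
def l1_distance(p, q):
    """Sum of absolute differences between two probability vectors."""
    return sum(abs(pi - qi) for pi, qi in zip(p, q))

def optimal_guess_probability(p, q):
    """Closed form 1/2 + 1/4 * ||p - q||_1 for equally likely sources."""
    return 0.5 + 0.25 * l1_distance(p, q)

def maximum_likelihood_probability(p, q):
    """Success probability of guessing the source that makes the observed
    outcome more likely, when each source is chosen with probability 1/2."""
    return 0.5 * sum(max(pi, qi) for pi, qi in zip(p, q))

# Two example output distributions over three outcomes (hypothetical values).
p = [0.5, 0.5, 0.0]
q = [0.25, 0.25, 0.5]

closed_form = optimal_guess_probability(p, q)
by_rule = maximum_likelihood_probability(p, q)
```

The two quantities agree because $\sum_x \max(p_x, q_x) = 1 + \frac{1}{2}\|p - q\|_1$ for any pair of probability vectors.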

To complete the proof that $\mathrm{QCD}_{a,b}$ is QIP-complete, the Close Images problem
is reduced to it. The next section contains a description of this reduction.

5.4 Reduction from Close Images



This section presents a reduction from the close images problem to the circuit
distinguishability problem that will be used to show that $\mathrm{QCD}_{a,b}$ is QIP-hard
for any constants $a$ and $b$ such that $0 < b < a \le 2$. This is done using a standard
polynomial-time Karp reduction: a polynomial-time procedure that transforms instances of one
problem to another and that outputs a yes instance of QCD if and only if the input instance
of CI was also a yes instance. The analysis of the reduction presented in this section
appears in Section 5.5.



The reduction takes as input an instance of the CI problem, which is given by a pair
$(Q_1, Q_2)$ of mixed-state quantum circuits implementing channels in
$T(\mathcal{H}, \mathcal{K})$. The reduction produces as output a pair of circuits
$(C_1, C_2)$ that form an instance of the QCD problem.

As in the case of the reduction in Chapter 4, we may assume that the input circuits are
given in Stinespring form. In this form the circuit consists of three parts. First is the
introduction of any ancillary qubits in the $|0\rangle$ state, second is a unitary circuit
applied to the input and ancillary qubits, and third is the tracing out of any qubits that
are not a part of the output space. A general mixed-state quantum circuit can be put into
this form in polynomial time, so it may be assumed without loss of generality that the
input circuits are of this form. This is discussed in more detail in Section 2.1.

As the reduction will modify the circuits $Q_1$ and $Q_2$, it is helpful to identify the
names of the various Hilbert spaces associated with them. As mentioned above, the circuit
$Q_i$ implements an operation in $T(\mathcal{H}, \mathcal{K})$. The spaces $\mathcal{H}$
and $\mathcal{K}$ will be referred to as the "input" and "output" spaces of the circuit,
respectively. Given in Stinespring form, the circuit $Q_i$ makes use of ancillary qubits.
Let the space $\mathcal{A}$ represent the space these ancillary qubits are added in, and
let $\mathcal{B}$ represent the space that is traced out after the unitary is applied. The
space $\mathcal{A}$ will be called the "ancillary" space of $Q_i$ and $\mathcal{B}$ will be
called the "environment" space. Furthermore, let $U_i$ be the unitary operation in
$U(\mathcal{H} \otimes \mathcal{A}, \mathcal{K} \otimes \mathcal{B})$ that is applied as
part of the circuit $Q_i$. As we may assume without loss of generality that each circuit
uses the same number of ancillary qubits, by padding one of the circuits with ancillary
qubits that are left unused and later traced out, we take the four Hilbert spaces
$\mathcal{H}, \mathcal{K}, \mathcal{A}, \mathcal{B}$ to be the same for each of the input
circuits. Notice also that since $U_i$ is unitary the spaces
$\mathcal{H} \otimes \mathcal{A}$ and $\mathcal{K} \otimes \mathcal{B}$ have the same
dimension, and so are isomorphic. The various Hilbert spaces associated with the circuit
$Q_i$ are summarized in Figure 5.2.



Figure 5.2: The circuit $Q_i$ in Stinespring form, with the Hilbert spaces labelled.

Figure 5.3: A circuit to apply $Q_1$ or $Q_2$ based on the value of a control qubit.

An important piece of the reduction will be a circuit that, based on the value of a control
qubit, applies either $Q_1$ or $Q_2$ to the input state. Such a circuit is easily
constructed in polynomial time by simply replacing each gate of $Q_1$ and $Q_2$ with gates
that are controlled by the value of the control qubit. These controlled gates need not be
in the family of gates in the circuit model, but since the number of gates in the model is
finite, decompositions of these controlled gates in terms of gates in the model can be
constructed efficiently. For concreteness, let the constructed circuit implement $Q_1$ when
the control qubit is $|0\rangle$ and $Q_2$ when the control qubit is $|1\rangle$. One
construction for such a circuit is given in Figure 5.3. Let $\mathcal{Q}$ be the Hilbert
space containing the control qubit, so that the constructed transformation is a channel in
$T(\mathcal{Q} \otimes \mathcal{H}, \mathcal{Q} \otimes \mathcal{K})$. There is some
ambiguity in the constructed transformation: the controlled $U_1$ operation takes an input
in $\mathcal{Q} \otimes \mathcal{H} \otimes \mathcal{A}$ and produces output in
$\mathcal{Q} \otimes \mathcal{K} \otimes \mathcal{B}$, and this is followed by a controlled
$U_2$ operation. This operation also expects an input in the space
$\mathcal{Q} \otimes \mathcal{H} \otimes \mathcal{A}$, not the space
$\mathcal{Q} \otimes \mathcal{K} \otimes \mathcal{B}$. Fortunately both of these spaces
have the same dimension, and so by implicitly making use of the isomorphism between them,
this potential difficulty is avoided.



The circuit shown in Figure 5.3 is very close to the circuits that will be the output of
the reduction. To obtain these circuits one critical modification is made: instead of
tracing out the environment space $\mathcal{B}$, the "output" space $\mathcal{K}$ is traced
out. This reversal of the purposes of the output and environment spaces is essential to the
reduction. Taking a Stinespring representation of a channel and tracing out the output
instead of the environment leads to what has been called a conjugate or complementary
channel [DS05, Hol07, KMNR07]. Viewed in this light, this circuit simply applies a fixed
conjugate of either $Q_1$ or $Q_2$, depending on the value of a control qubit. This is how
the circuit $C_1$, one of the two circuits output by the reduction, is constructed. This
circuit is shown in Figure 5.4(a).

Figure 5.4: Circuits output by the reduction. (a) The circuit $C_1$. (b) The circuit $C_2$.



The circuit $C_2$ is constructed in the same way as the circuit $C_1$, with one difference:
a Pauli $Z$ operation is applied to the control qubit after the controlled operations. The
circuit $C_2$ is shown in Figure 5.4(b). This $Z$ gate will make a substantial difference
in the output of the two circuits when the control qubit in the space $\mathcal{Q}$ has not
been decohered by the other operations of the circuit $C_2$. The output of the reduction is
an instance of QCD given by the pair of circuits $(C_1, C_2)$.

The key to this reduction is that when an input is given to either $C_1$ or $C_2$ with the
control qubit in a superposition of $|0\rangle$ and $|1\rangle$, then both of the circuits
$Q_1$ and $Q_2$ are run. The idea behind tracing out the "output" space $\mathcal{K}$ is
that if the outputs of $Q_1$ and $Q_2$ are sufficiently far apart, then tracing out the
space $\mathcal{K}$ is akin to measuring the control qubit but forgetting the result.
Intuitively, if there is enough information in $\mathcal{K}$ to identify which of the two
circuits $Q_1$ or $Q_2$ has been performed, then the control qubit will be subject to
decoherence. In this case the Pauli $Z$ gate in $C_2$ has no effect: the control qubit has
decohered to a mixture of the form $p|0\rangle\langle 0| + (1-p)|1\rangle\langle 1|$, and
applying $Z$ to such a state has no effect, since

$$Z\big(p|0\rangle\langle 0| + (1-p)|1\rangle\langle 1|\big)Z = pZ|0\rangle\langle 0|Z + (1-p)Z|1\rangle\langle 1|Z = p|0\rangle\langle 0| + (1-p)|1\rangle\langle 1|.$$

On the other hand, if the outputs of $Q_1$ and $Q_2$ are sufficiently close, then there
should be no information about the control qubit in the space $\mathcal{K}$, so tracing it
out will not have any effect. In this case the control qubit remains in a pure state such
as $(|0\rangle + |1\rangle)/\sqrt{2}$, so that applying the Pauli $Z$ operation in the
circuit $C_2$ results in the state

$$Z\,\frac{|0\rangle + |1\rangle}{\sqrt{2}} = \frac{|0\rangle - |1\rangle}{\sqrt{2}} = |-\rangle,$$

which is orthogonal to the control qubit output by $C_1$. In this way the reduction
effectively inverts the closeness of the circuits: if the circuits $Q_1$ and $Q_2$ can be
made to output states that are close together, then the output of the circuits $C_1$ and
$C_2$ can be made far apart. In the other direction, if the original circuits always output
states that are distinguishable, then the constructed circuits will always be close
together, as the control qubit is left in an incoherent mixture after tracing out the space
$\mathcal{K}$, so that the $Z$ operation in $C_2$ has little effect.
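The two cases of this intuition can be checked on a single control qubit. The following sketch uses plain nested lists for $2 \times 2$ matrices; the helper names and the mixing weight `p` are illustrative assumptions, and the sketch covers only the control qubit, not the full circuits $C_1$ and $C_2$.

```python
# Pauli Z as a 2x2 matrix.
Z = [[1, 0], [0, -1]]

def matmul(a, b):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def conjugate_by_z(rho):
    """Return Z rho Z (Z is Hermitian, so its adjoint is Z itself)."""
    return matmul(matmul(Z, rho), Z)

def overlap(rho, sigma):
    """tr(rho sigma); for pure states this is the squared inner product."""
    prod = matmul(rho, sigma)
    return prod[0][0] + prod[1][1]

p = 0.3  # illustrative mixing weight
decohered = [[p, 0.0], [0.0, 1 - p]]  # p|0><0| + (1-p)|1><1|
plus = [[0.5, 0.5], [0.5, 0.5]]       # |+><+|, a fully coherent control qubit

z_decohered = conjugate_by_z(decohered)  # unchanged by Z
z_plus = conjugate_by_z(plus)            # becomes |-><-|
plus_minus_overlap = overlap(plus, z_plus)
```

The decohered state is fixed by the $Z$ gate, while the coherent state is mapped to an orthogonal one, which is the source of the large distance between $C_1$ and $C_2$ in the second case.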

This argument is not complete: the circuits $C_1$ and $C_2$ also output the environment
space $\mathcal{B}$ of the original circuits, which has been ignored. The notions of
closeness used in the problems CI and QCD are also not the same. Significant care must be
taken to formalize this intuitive picture. This is the content of the next section, which
contains a formal proof that this reduction implies the QIP-hardness of the QCD problem.



5.5 Correctness of the reduction 



This section contains the formal proof that the reduction presented in Section 5.4 implies
that the QCD problem is QIP-hard. Proving that this problem is QIP-hard implies that it is
also QIP-complete, as it is argued in Section 5.3 that these problems belong to QIP.

This section is quite technical. The reader uninterested in the details of the proof 
of the main result is invited to skip the proofs of the lemmas found here: the proofs 
are not overly difficult but much of the intuition has already been presented in the 
previous section, so it is unlikely that a detailed study of these proofs will provide a 
clearer picture of the results. 
As in the previous section, let $(Q_1, Q_2)$ be the instance of CI provided as input to the
reduction, where these circuits implement transformations in $T(\mathcal{H}, \mathcal{K})$
given by

$$Q_1(\rho) = \mathrm{tr}_{\mathcal{B}}\, U_1 (\rho \otimes |0\rangle\langle 0|) U_1^*,$$
$$Q_2(\rho) = \mathrm{tr}_{\mathcal{B}}\, U_2 (\rho \otimes |0\rangle\langle 0|) U_2^*,$$

where $U_1, U_2 \in U(\mathcal{H} \otimes \mathcal{A}, \mathcal{K} \otimes \mathcal{B})$.
This is summarized in Figure 5.2.



From these circuits the reduction described in the previous section produces as output
$(C_1, C_2)$, a pair of circuits that form an instance of QCD. For notational convenience,
let $V$ be the operator that applies the operator $U_{i+1}$ when a control qubit is in the
$|i\rangle$ state, i.e. let the operation $V$ be defined by

$$V(|0\rangle \otimes |\psi\rangle) = |0\rangle \otimes U_1 |\psi\rangle,$$
$$V(|1\rangle \otimes |\psi\rangle) = |1\rangle \otimes U_2 |\psi\rangle,$$

for all states $|\psi\rangle$. One implementation of the operation $V$ is given by
Figure 5.3, with the exception that the operation $V$ does not trace out the qubits in the
space $\mathcal{B}$.

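Using this notation, the circuits in Figure 5.4 implement the operations in the space
$T(\mathcal{Q} \otimes \mathcal{H}, \mathcal{Q} \otimes \mathcal{B})$ given by

$$C_1(\rho) = \mathrm{tr}_{\mathcal{K}}\, V (\rho \otimes |0\rangle\langle 0|) V^*,$$
$$C_2(\rho) = \mathrm{tr}_{\mathcal{K}}\, Z_{\mathcal{Q}} V (\rho \otimes |0\rangle\langle 0|) V^* Z_{\mathcal{Q}}. \qquad (5.1)$$

The operation $Z_{\mathcal{Q}}$ in this equation is simply shorthand for the application of
the Pauli $Z$ gate to the qubit represented by $\mathcal{Q}$, i.e.
$Z_{\mathcal{Q}} = Z \otimes I_{\mathcal{K}} \otimes I_{\mathcal{B}}$. This
characterization of the circuits produced by the reduction will be used to show that $C_1$
and $C_2$ are distinguishable if and only if $Q_1$ and $Q_2$ have close images.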

The main result of this section is that the maximum output fidelity of $Q_1$ and $Q_2$ is
equal to half the diamond norm of the difference of $C_1$ and $C_2$. The proof of this is
presented in two steps. The first, and simplest, of these steps is to show that the diamond
norm provides a lower bound on the maximum output fidelity. This is argued directly from
the properties of the diamond norm and the constructed circuits.

Lemma 5.4. Given circuits $Q_1$ and $Q_2$, and the circuits $C_1$ and $C_2$ constructed
from them given by (5.1),

$$\frac{1}{2} \| C_1 - C_2 \|_\diamond \le \max_{\rho, \sigma} F(Q_1(\rho), Q_2(\sigma)).$$
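Proof. By Theorem 3.17 the diamond norm of the difference of two channels is achieved on a
pure state in some larger system. Let this state be
$|\psi\rangle \in \mathcal{Q} \otimes \mathcal{H} \otimes \mathcal{F}$, where $\mathcal{F}$
is the reference system implied by the theorem. For this state, we have

$$\| C_1 - C_2 \|_\diamond = \| (C_1 \otimes I_{\mathcal{F}})(|\psi\rangle\langle\psi|) - (C_2 \otimes I_{\mathcal{F}})(|\psi\rangle\langle\psi|) \|_{\mathrm{tr}}. \qquad (5.2)$$

As $|\psi\rangle$ is a unit vector, it may be written in terms of components on the two
subspaces where the qubit in the space $\mathcal{Q}$ is either $|0\rangle$ or $|1\rangle$.
More formally, there is some $p \in [0, 1]$ and states
$|\psi_0\rangle, |\psi_1\rangle \in \mathcal{H} \otimes \mathcal{F}$ such that

$$|\psi\rangle = \sqrt{p}\, |0\rangle |\psi_0\rangle + \sqrt{1-p}\, |1\rangle |\psi_1\rangle.$$

Using this decomposition we can evaluate the circuits $C_1$ and $C_2$ on the state
$|\psi\rangle$. The input state $|\psi\rangle\langle\psi|$ can be decomposed as

$$\begin{aligned}
& p\, |0\rangle\langle 0| \otimes |\psi_0\rangle\langle\psi_0| + (1-p)\, |1\rangle\langle 1| \otimes |\psi_1\rangle\langle\psi_1| \\
& \quad + \sqrt{p(1-p)}\, |0\rangle\langle 1| \otimes |\psi_0\rangle\langle\psi_1| \\
& \quad + \sqrt{p(1-p)}\, |1\rangle\langle 0| \otimes |\psi_1\rangle\langle\psi_0|.
\end{aligned} \qquad (5.3)$$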

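For the sake of brevity, further notation is introduced. Let
$|\phi_i\rangle = (U_{i+1} \otimes I_{\mathcal{F}}) |\psi_i\rangle$. This notation suffices
as the unitary $U_1$ is only applied to the state $|\psi_0\rangle$, and likewise $U_2$ is
only applied to the state $|\psi_1\rangle$. Making use of this notation, we can consider
the behaviour of the circuits $C_1$ and $C_2$ on the terms in Equation (5.3). The output of
$C_1$ is given by

$$(C_1 \otimes I_{\mathcal{F}})(|i\rangle\langle j| \otimes |\psi_i\rangle\langle\psi_j|) = |i\rangle\langle j| \otimes \mathrm{tr}_{\mathcal{K}} |\phi_i\rangle\langle\phi_j| \qquad (5.4)$$

for all $i, j \in \{0, 1\}$. The output of $C_2$ differs only slightly, being given by

$$(C_2 \otimes I_{\mathcal{F}})(|i\rangle\langle j| \otimes |\psi_i\rangle\langle\psi_j|) = (-1)^{i+j} |i\rangle\langle j| \otimes \mathrm{tr}_{\mathcal{K}} |\phi_i\rangle\langle\phi_j|, \qquad (5.5)$$

where the $(-1)^{i+j}$ factor is due to the Pauli $Z$ gate in the circuit $C_2$. Notice
that when $i = j$, as in the first two terms of Equation (5.3), the two circuits produce
identical output. The difference between the two circuits lies in the behaviour on the
final two terms of this equation. On these two terms the circuits agree, up to a
multiplicative factor of $-1$, as can be seen from Equations (5.4) and (5.5). Using this
observation, the difference between the outputs of the two circuits is given by

$$(C_1 \otimes I_{\mathcal{F}} - C_2 \otimes I_{\mathcal{F}})(|\psi\rangle\langle\psi|) = 2\sqrt{p(1-p)} \left( |0\rangle\langle 1| \otimes \mathrm{tr}_{\mathcal{K}} |\phi_0\rangle\langle\phi_1| + |1\rangle\langle 0| \otimes \mathrm{tr}_{\mathcal{K}} |\phi_1\rangle\langle\phi_0| \right).$$

Combining this with Equation (5.2) yields

$$\begin{aligned}
\| C_1 - C_2 \|_\diamond &= \| (C_1 \otimes I_{\mathcal{F}} - C_2 \otimes I_{\mathcal{F}})(|\psi\rangle\langle\psi|) \|_{\mathrm{tr}} \\
&= 2\sqrt{p(1-p)}\, \big\| |0\rangle\langle 1| \otimes \mathrm{tr}_{\mathcal{K}} |\phi_0\rangle\langle\phi_1| + |1\rangle\langle 0| \otimes \mathrm{tr}_{\mathcal{K}} |\phi_1\rangle\langle\phi_0| \big\|_{\mathrm{tr}}.
\end{aligned}$$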
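From this equation, as well as Lemmas 3.10 and 3.21, we find that

$$\begin{aligned}
\| C_1 - C_2 \|_\diamond &= 2\sqrt{p(1-p)}\, \big\| |0\rangle\langle 1| \otimes \mathrm{tr}_{\mathcal{K}} |\phi_0\rangle\langle\phi_1| + |1\rangle\langle 0| \otimes \mathrm{tr}_{\mathcal{K}} |\phi_1\rangle\langle\phi_0| \big\|_{\mathrm{tr}} \\
&= 4\sqrt{p(1-p)}\, \big\| \mathrm{tr}_{\mathcal{K}} |\phi_0\rangle\langle\phi_1| \big\|_{\mathrm{tr}} \\
&= 4\sqrt{p(1-p)}\, F\big( \mathrm{tr}_{\mathcal{B} \otimes \mathcal{F}} |\phi_0\rangle\langle\phi_0|,\ \mathrm{tr}_{\mathcal{B} \otimes \mathcal{F}} |\phi_1\rangle\langle\phi_1| \big) \\
&\le 2 \max_{\rho, \sigma} F(Q_1(\rho), Q_2(\sigma)),
\end{aligned}$$

where the final inequality uses $\sqrt{p(1-p)} \le 1/2$ together with the fact that
$\mathrm{tr}_{\mathcal{B} \otimes \mathcal{F}} |\phi_0\rangle\langle\phi_0|$ and
$\mathrm{tr}_{\mathcal{B} \otimes \mathcal{F}} |\phi_1\rangle\langle\phi_1|$ are outputs of
$Q_1$ and $Q_2$, respectively. This completes the proof of the lemma. □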
The following lemma formalizes the intuitive picture presented in Section 5.4 that the
constructed circuits $C_1$ and $C_2$ are distinguishable if the original circuits $Q_1$ and
$Q_2$ have output states with high fidelity. This is the second direction of the proof of
the main result of this chapter.

Lemma 5.5. Given circuits $Q_1$ and $Q_2$, and the circuits $C_1$ and $C_2$ constructed
from them given by (5.1),

$$\frac{1}{2} \| C_1 - C_2 \|_\diamond \ge \max_{\rho, \sigma} F(Q_1(\rho), Q_2(\sigma)).$$

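Proof. Let $\rho_1, \rho_2 \in D(\mathcal{H})$ be two arbitrary states. For these states,
we will show that

$$\| C_1 - C_2 \|_\diamond \ge 2 F(Q_1(\rho_1), Q_2(\rho_2)).$$

Let the states $|\psi_0\rangle, |\psi_1\rangle \in \mathcal{H} \otimes \mathcal{F}$ be
purifications of $\rho_1$ and $\rho_2$, where $\mathcal{F}$ is any Hilbert space large
enough to admit these purifications. These states will play a similar role to the states of
the same name used in the proof of Lemma 5.4. Following the notation in this lemma further,
let $|\phi_i\rangle = (U_{i+1} \otimes I_{\mathcal{F}}) |\psi_i\rangle$ be the states
produced by applying the "appropriate" unitary to these states. Using this notation,
consider the input state
$|\psi\rangle \in \mathcal{Q} \otimes \mathcal{H} \otimes \mathcal{F}$ to $C_1$ and $C_2$
given by

$$|\psi\rangle = \frac{1}{\sqrt{2}} |0\rangle |\psi_0\rangle + \frac{1}{\sqrt{2}} |1\rangle |\psi_1\rangle.$$

On this state the output of the two channels is exactly as discussed in Lemma 5.4 with
$p = 1/2$. In particular, the channel $C_1$ produces the output

$$(C_1 \otimes I_{\mathcal{F}})(|\psi\rangle\langle\psi|) = \frac{1}{2} \sum_{i,j \in \{0,1\}} |i\rangle\langle j| \otimes \mathrm{tr}_{\mathcal{K}} |\phi_i\rangle\langle\phi_j| \qquad (5.6)$$

while the circuit $C_2$ produces the output

$$(C_2 \otimes I_{\mathcal{F}})(|\psi\rangle\langle\psi|) = \frac{1}{2} \sum_{i,j \in \{0,1\}} (-1)^{i+j} |i\rangle\langle j| \otimes \mathrm{tr}_{\mathcal{K}} |\phi_i\rangle\langle\phi_j|. \qquad (5.7)$$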
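These equations follow from identical reasoning to the derivation of Equations (5.4) and
(5.5) of the previous lemma.

Notice that the pure states
$|\phi_0\rangle, |\phi_1\rangle \in \mathcal{K} \otimes \mathcal{B} \otimes \mathcal{F}$
are purifications of $Q_1(\rho_1)$ and $Q_2(\rho_2)$, respectively. This allows us to use
Lemma 3.21 to transform the trace norm of
$\mathrm{tr}_{\mathcal{K}} |\phi_0\rangle\langle\phi_1|$ into the fidelity of the outputs
of the original circuits on the input states $\rho_1$ and $\rho_2$. This trace norm will be
essential, as the difference between the outputs of the circuits $C_i$ consists only of
terms of this form, which can be seen from Equations (5.6) and (5.7). This, along with
Theorem 3.14 and Lemma 3.10, shows that

$$\begin{aligned}
\| C_1 - C_2 \|_\diamond &\ge \| (C_1 \otimes I_{\mathcal{F}} - C_2 \otimes I_{\mathcal{F}})(|\psi\rangle\langle\psi|) \|_{\mathrm{tr}} \\
&= \big\| |0\rangle\langle 1| \otimes \mathrm{tr}_{\mathcal{K}} |\phi_0\rangle\langle\phi_1| + |1\rangle\langle 0| \otimes \mathrm{tr}_{\mathcal{K}} |\phi_1\rangle\langle\phi_0| \big\|_{\mathrm{tr}} \\
&= 2 \big\| \mathrm{tr}_{\mathcal{K}} |\phi_0\rangle\langle\phi_1| \big\|_{\mathrm{tr}} \\
&= 2 F(Q_1(\rho_1), Q_2(\rho_2)).
\end{aligned}$$

This completes the proof of the lemma. □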



With these two lemmas, most of the work in proving the main result has been completed. We
have so far shown that half the diamond norm difference of the constructed instance
$(C_1, C_2)$ of QCD and the maximum output fidelity of the input instance $(Q_1, Q_2)$ of
CI are equal. This fact is stated as the following theorem for easy reference. This also
proves that the reduction correctly produces "yes" instances of QCD if and only if it is
given "yes" instances of CI.

Theorem 5.6. Given circuits $Q_1$ and $Q_2$, and the circuits $C_1$ and $C_2$ constructed
from them given by (5.1),

$$\frac{1}{2} \| C_1 - C_2 \|_\diamond = \max_{\rho, \sigma \in D(\mathcal{H})} F(Q_1(\rho), Q_2(\sigma)).$$

Proof. Lemma 5.4 provides the upper bound and Lemma 5.5 provides the lower bound. Taken
together they prove the desired equation. □

This theorem immediately implies the main result of the chapter: the QIP-hardness of the
distinguishability problem for mixed-state quantum circuits.

Corollary 5.7. For any $0 < b < a \le 2$ the problem $\mathrm{QCD}_{a,b}$ is QIP-complete.

Proof. Theorem 5.6 and the construction in Section 5.4 imply the reduction

$$\mathrm{CI}_{a/2,\, b/2} \le \mathrm{QCD}_{a,b}.$$

As $\mathrm{CI}_{a/2,\, b/2}$ is QIP-hard for any $0 < b < a \le 2$ by Theorem 4.3, which
is due to Kitaev and Watrous [KW00], this reduction implies that the distinguishability
problem is QIP-hard for the values of $a, b$ specified in the corollary.

The distinguishability problem is also complete for QIP for these values of $a$ and $b$, as
it is in QIP by Theorem 5.3. □



5.6 Distinguishing log-depth computations 

This section discusses how to extend the hardness of the QCD problem to the case of input
circuits that have logarithmic depth. This can be done by simply noting that the reduction
of Section 5.4 can be modified to produce output circuits of logarithmic depth, and the
hardness will follow from the hardness of the log-depth close images problem.

To see that the QIP-hardness of the problem Log-depth $\mathrm{CI}_{a/2,\, b/2}$ can be
extended to the problem Log-depth $\mathrm{QCD}_{a,b}$, observe that the reduction in
Section 5.4 simply takes the input circuits and produces circuits that apply them based on
the value of a control qubit. These controlled operations can be implemented in logarithmic
depth using a tree structure with copies of the control qubit made in the computational
basis; this is discussed in Proposition 2.1. If this more careful implementation of the
controlled $U_1$ and $U_2$ operations is made, then the output circuits of Figure 5.4 have
logarithmic depth if and only if the input circuits do. This requires that the circuits for
the operations $U_i$ that implement the input circuits $Q_i$ by the equation

$$Q_i(\rho) = \mathrm{tr}_{\mathcal{B}}\, U_i (\rho \otimes |0\rangle\langle 0|) U_i^*$$

can be assumed to have logarithmic depth when the mixed-state circuits $Q_i$ do, but these
circuits can be constructed by simply delaying any partial trace operations that are
performed during the circuit. These circuits have the same depth as the original
mixed-state circuits and they can be constructed in polynomial time. This implies that

$$\text{Log-depth } \mathrm{CI}_{a/2,\, b/2} \le \text{Log-depth } \mathrm{QCD}_{a,b},$$

which, by Corollary 4.8, immediately implies the following corollary, since Log-depth QCD
is in QIP by Theorem 5.3, as it is a restriction of the general problem.

Corollary 5.8. For any $0 < b < a \le 2$ the problem Log-depth $\mathrm{QCD}_{a,b}$ is
QIP-complete.

As in Chapter 4, the only place that this construction requires logarithmic depth circuits
is in the controlled operations. If the unbounded fan-out gate is allowed into the basis of
computational gates, then the circuits can be reduced to constant depth, as discussed in
Proposition 2.2. This implies the following result.

Corollary 5.9. For any $0 < b < a \le 2$, the problem Const-depth $\mathrm{QCD}_{a,b}$ on
circuits with the unbounded fan-out gate is QIP-complete.



5.7 Conclusion 

In this chapter, the problem Quantum Circuit Distinguishability has been introduced and
studied. This is the problem of determining, given two quantum channels specified as
mixed-state quantum circuits, whether there is an input on which they produce nearly
orthogonal output states, or whether they are effectively the same on all inputs.

The main result of the chapter is that this problem is complete for the class QIP of
problems that have quantum interactive proof systems, when the phrases "nearly orthogonal"
and "effectively the same" are formalized as large and small diamond norm distance,
respectively. This result requires many of the results on the diamond norm from Chapter 3,
such as Theorem 3.17, which proves that the diamond norm of the difference of two channels
is achieved on a pure state input. This result can also be extended to the case of channels
specified by logarithmic depth circuits, or even constant depth circuits if the unbounded
fan-out gate is included in the circuit model.

The main result of this chapter is essential for the result in the next two chapters that
this distinguishability problem remains hard when restricted to circuits that implement
convex mixtures of unitary channels and when restricted to the degradable or antidegradable
channels. These results will be shown by reducing the problem considered here to restricted
versions.
Chapter 6

Degradable and Antidegradable Channels
The degradable and antidegradable channels are two of the most interesting classes 
of quantum channels. The degradable channels are, informally, the channels where 
the output space contains more information about the input than the environment, in 
the sense that the output state can be used to reconstruct the state of the environment. 
The antidegradable channels can be similarly thought of as those channels whose 
output contains less information about the input than the environment does. These 
channels have many nice properties when considered for the transmission of quantum 
information. This is interesting as these channels can be otherwise awkward to work 
with: as an example, the set of degradable channels is not even convex! 

The main result of this chapter is that the quantum circuit distinguishability problem
considered in Chapter 5 remains QIP-complete on both the degradable and antidegradable
channels. This lends evidence to the notion that the difficulty of distinguishing quantum
channels has little to do with how well they preserve information.
6.1 Degradable and antidegradable channels

As defined in Chapter 1, a channel $\Phi \in T(\mathcal{H}, \mathcal{K})$ is degradable if
there exists a second channel $\Lambda$ that maps the output state of $\Phi$ to the state
of the environment. More precisely, if

$$\Phi(\rho) = \mathrm{tr}_{\mathcal{B}}\, U (\rho \otimes |0\rangle\langle 0|) U^*,$$

then $\Phi$ is called degradable if there exists a channel
$\Lambda \in T(\mathcal{K}, \mathcal{B})$ such that

$$(\Lambda \circ \Phi)(\rho) = \mathrm{tr}_{\mathcal{K}}\, U (\rho \otimes |0\rangle\langle 0|) U^* = \Phi^C(\rho).$$

The channel $\Phi^C$ is called the complementary channel to $\Phi$, and it is only defined
up to an isometry, since it depends on the Stinespring representation of $\Phi$. This does
not affect the notion of degradability, however, as this isometry can be viewed as a part
of the degrading map $\Lambda$. Loosely, these are the channels whose output contains more
information than the environment, because the output can be degraded to give the state of
the environment. These channels were introduced by Shor and Devetak [DS05] to study the
capacity of a channel for transmitting quantum information. Notice that the set of
degradable channels is not convex: any unitary channel is degradable, but the completely
depolarizing channel is not, and it can be written as a convex combination of unitary
channels (see Proposition 7.2).



A channel is called antidegradable if the complementary channel is degradable.
Alternately, a channel $\Phi$ is antidegradable if there exists a map $\Lambda$ such that
$\Lambda \circ \Phi^C = \Phi$, where once again the channel $\Phi^C$ is only defined up to
an isometry, but this isometry can also be part of the map $\Lambda$, so that the
antidegradable channels are also well-defined. This class of channels was introduced by
Wolf and Perez-Garcia [WPG07], and can be informally thought of as the class of those very
noisy channels that lose more information to the environment than they preserve in the
output. A thorough discussion of the degradable and antidegradable channels can be found in
[CRS08], where it is shown that, unlike the degradable channels, the set of antidegradable
channels is convex.

The degradable and antidegradable channels are very interesting from a quantum information
perspective. A simple no-cloning argument implies that the antidegradable channels have
zero capacity for the transmission of quantum information. This argument was first
presented for erasure channels in [BDS97], extended to lossy bosonic channels in [GLMS03],
and finally applied to antidegradable channels in [GF05]. It is also known that the
coherent information is additive on degradable channels, which implies that the quantum
capacity is given by the coherent information of a single use of the channel, i.e. that the
formula for the quantum capacity does not require regularization [DS05].

As the degradable and antidegradable channels have nice properties with respect to the
transmission of quantum information, it might be hoped that similar properties extend to
the transmission of classical information. In the case of the Holevo (or $\chi$-)capacity,
it is shown in [CRS08] that the additivity of this quantity on degradable channels is
equivalent to the general case, making use of a result from [FW07]. As it is also known
that this additivity problem is equivalent on the complementary class of channels
[Hol07, KMNR07], this implies that the additivity of the antidegradable channels is also
equivalent to the general case. Finally, using the recent result of Hastings [Has09], there
are degradable and antidegradable channels that do not have additive Holevo capacity.

Interestingly, we can adapt the same construction used by Cubitt, Ruskai, and Smith [CRS08]
to show that the quantum circuit distinguishability problem restricted to either the
degradable or antidegradable channels remains QIP-complete. These results are the focus of
the remainder of this chapter.



6.2 Simulation by a degradable channel 

Given a circuit $Q$ implementing a transformation in $T(\mathcal{H}, \mathcal{K})$, the
goal is to efficiently construct a circuit $C$ implementing a degradable channel in
$T(\mathcal{H}, \mathcal{K})$ that is closely related to the original circuit $Q$. This
reduction will make use of the results used in the case of the minimum output entropy
[CRS08]: the construction presented here, as well as the proof that the resulting channel
is degradable, can both be found in this work.

To describe the channel, we assume that $\dim \mathcal{H} = \dim \mathcal{K}$, i.e. that
the circuit $Q$ has identical input and output dimension. This may be assumed without loss
of generality by padding the smaller space with unused $|0\rangle$ qubits, since these
qubits will not affect the diamond norm used in the definition of the distinguishability
problem. Once this padding has been completed, we may view $Q$ as an implementation of some
channel in $T(\mathcal{H}, \mathcal{H})$. The channel $C$ constructed from $Q$ will make
use of an additional output qubit in the space $\mathcal{C}$ of dimension 2, so that
$C \in T(\mathcal{H}, \mathcal{C} \otimes \mathcal{H})$.

The basic idea is to implement the channel

$$C(\rho) = \frac{1}{2} |0\rangle\langle 0| \otimes \rho + \frac{1}{2} |1\rangle\langle 1| \otimes Q(\rho). \qquad (6.1)$$
Figure 6.1: The degradable channel $C$ constructed from $Q$.

This is just the channel that applies the circuit $Q$ with probability 1/2, and does
nothing to the input with probability 1/2. If $Q$ is given in Stinespring form with unitary
$U$, so that

$$Q(\rho) = \mathrm{tr}_{\mathcal{B}}\, U (\rho \otimes |0\rangle\langle 0|) U^*,$$

then the channel $C$ can be implemented as shown in Figure 6.1. The idea in this
implementation is that the top ancillary qubit (which is the qubit in the space
$\mathcal{C}$) is placed in the $|+\rangle$ state, which results in the circuit for $Q$
being applied with probability one-half, as the value of this qubit is 'copied' onto one of
the environment qubits by the controlled-not gate. This results in the mixture in
Equation (6.1).
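The mixture in Equation (6.1) is a block-diagonal matrix with the flag qubit as the first tensor factor. The following sketch builds it directly for a small example. It is an illustration under stated assumptions, not a simulation of the circuit in Figure 6.1: the stand-in channel `bit_flip` and the helper `flagged_mixture` are hypothetical names chosen here, and the channel is applied as a function on density matrices rather than through its Stinespring unitary.

```python
def bit_flip(rho):
    """Hypothetical stand-in for Q on one qubit: rho -> X rho X,
    which swaps both row and column indices of the 2x2 matrix."""
    return [[rho[1][1], rho[1][0]], [rho[0][1], rho[0][0]]]

def flagged_mixture(channel, rho):
    """Return (1/2)|0><0| (x) rho + (1/2)|1><1| (x) channel(rho).

    With the flag as the first tensor factor, the result is block diagonal:
    the upper-left block is rho / 2 and the lower-right block is channel(rho) / 2.
    """
    n = len(rho)
    out = channel(rho)
    result = [[0.0] * (2 * n) for _ in range(2 * n)]
    for i in range(n):
        for j in range(n):
            result[i][j] = 0.5 * rho[i][j]
            result[n + i][n + j] = 0.5 * out[i][j]
    return result

# Example input: the single-qubit state |0><0|.
rho = [[1.0, 0.0], [0.0, 0.0]]
c_rho = flagged_mixture(bit_flip, rho)
trace = sum(c_rho[k][k] for k in range(4))
```

Reading the flag block tells the receiver whether the remaining register holds the untouched input or the output of the channel, which is what the degrading map constructed next relies on.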

To see that the circuit $C$ implements a degradable operation, the degrading map that takes
the output state to the environment state can be explicitly constructed. As the
complementary channel is defined only up to an isometry, we may construct this map for any
of the complementary channels defined by $C$, as this isometry can be added to the
degrading map as required. For this reason, we consider the complementary channel defined
by the implementation in Figure 6.1, i.e. the channel from the input to all those qubits
that are traced out. This results in the complementary channel

$$C^C(\rho) = \frac{1}{2} |0\rangle\langle 0| \otimes |0\rangle\langle 0| + \frac{1}{2} |1\rangle\langle 1| \otimes Q^C(\rho), \qquad (6.2)$$

where $Q^C$ has implementation

$$Q^C(\rho) = \mathrm{tr}_{\mathcal{K}}\, U (\rho \otimes |0\rangle\langle 0|) U^*,$$

which is obtained by tracing out the 'output' space of the original circuit.

It is not hard to see how to implement the degrading map $\Lambda_C$ for this channel.
Starting with the output state of $C$,

$$C(\rho) = \frac{1}{2} |0\rangle\langle 0| \otimes \rho + \frac{1}{2} |1\rangle\langle 1| \otimes Q(\rho),$$

as given by Equation (6.1), this channel can, based on the flag state in the space
$\mathcal{C}$, either output $|0\rangle\langle 0|$ or $Q^C(\rho)$. More formally, when the
flag state is $|0\rangle$, the state in $\mathcal{H}$ is the original input $\rho$, so that
the channel $Q^C$ can be applied to it by performing the unitary $U$ from the circuit $Q$
and tracing out the appropriate space. On the other hand, when this flag state is
$|1\rangle$, the degrading map needs to output $|0\rangle\langle 0|$, which can be done by
producing the correct number of untouched ancillary qubits as output. All that remains is
to invert the flag qubit to get exactly the output of $C^C$. A circuit implementation of
the channel $\Lambda_C$ is presented in Figure 6.2.

Figure 6.2: The degrading channel $\Lambda_C$ corresponding to the channel $C$ in Figure 6.1.

We can formally verify that this map performs the required operation by observing that

$$\begin{aligned}
\Lambda_C(C(\rho)) &= \frac{1}{2} \Lambda_C \big( |0\rangle\langle 0| \otimes \rho + |1\rangle\langle 1| \otimes Q(\rho) \big) \\
&= \frac{1}{2} |1\rangle\langle 1| \otimes Q^C(\rho) + \frac{1}{2} |0\rangle\langle 0| \otimes |0\rangle\langle 0| \\
&= C^C(\rho),
\end{aligned}$$

where the final equality is Equation (6.2). This argument, due to Cubitt, Ruskai, and Smith
[CRS08], proves that the channel $C$ is degradable. In the next section we consider the
implications of this construction for the computational hardness of the problem of
distinguishing quantum circuits.



6.3 Distinguishing degradable channels 

The construction in the previous section essentially embeds any channel into a degradable
channel. This construction can be used to show that distinguishing degradable
channels is no easier than distinguishing general channels.

As a first step towards this, a formal definition of the degradable version of the circuit
distinguishability problem of Chapter 5 is presented. This is simply the general problem
with the extra restriction that the input circuits implement channels that are degradable.

Problem 6.1 (Degradable Quantum Circuit Distinguishability). For constants $0 \le b < a \le 2$,
the input consists of quantum circuits $C_1$ and $C_2$ that implement degradable
transformations in $T(\mathcal{H}, \mathcal{K})$. The promise problem is to distinguish the two cases:

Yes: $\|C_1 - C_2\|_\diamond \ge a$.
No: $\|C_1 - C_2\|_\diamond \le b$.

The primary ingredient in the proof that this problem is QIP-complete is the result
that the construction in the previous section does not significantly affect the diamond
norm of the difference of two channels. This is not difficult to see from the output of
the construction, given by Equation (6.1), but for completeness it is argued formally in
the following lemma.

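Lemma 6.2. Let $Q_1, Q_2$ be quantum circuits implementing transformations in $T(\mathcal{H}, \mathcal{K})$. If
$C_1, C_2 \in T(\mathcal{H}, \mathcal{C} \otimes \mathcal{K})$ are given by

$$C_i(\rho) = \frac{1}{2}|0\rangle\langle 0| \otimes \rho + \frac{1}{2}|1\rangle\langle 1| \otimes Q_i(\rho),$$

for $i \in \{1, 2\}$, then

$$\|C_1 - C_2\|_\diamond = \frac{1}{2}\|Q_1 - Q_2\|_\diamond.$$

Proof. Let $\rho \in D(\mathcal{H} \otimes \mathcal{F})$ be an arbitrary state. Then

$$\begin{aligned}
\|(C_1 \otimes I_{\mathcal{F}} - C_2 \otimes I_{\mathcal{F}})(\rho)\|_{\mathrm{tr}}
&= \frac{1}{2}\big\| |0\rangle\langle 0| \otimes (\rho - \rho) + |1\rangle\langle 1| \otimes \big([Q_1 \otimes I_{\mathcal{F}} - Q_2 \otimes I_{\mathcal{F}}](\rho)\big) \big\|_{\mathrm{tr}} \\
&= \frac{1}{2}\big\| |1\rangle\langle 1| \otimes \big([Q_1 \otimes I_{\mathcal{F}} - Q_2 \otimes I_{\mathcal{F}}](\rho)\big) \big\|_{\mathrm{tr}} \\
&= \frac{1}{2}\big\| (Q_1 \otimes I_{\mathcal{F}} - Q_2 \otimes I_{\mathcal{F}})(\rho) \big\|_{\mathrm{tr}}.
\end{aligned}$$

Since the diamond norm is defined as the maximization over all states $\rho$, this implies
the statement of the lemma. □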
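The per-state identity used in this proof can be checked numerically on a small example. The following is only a sketch under simplifying assumptions that are not part of the original text: both circuits act on a single qubit, the reference system is omitted, the channels are supplied as lists of Kraus operators, and the names `apply_kraus`, `trace_norm_2x2` and `embed` are hypothetical helpers. Because the flag qubit makes the output of each constructed channel block diagonal, the trace norm is computed block by block.

```python
import math

def apply_kraus(kraus, rho):
    """Return sum_K K rho K^dagger for 2x2 matrices given as nested lists."""
    out = [[0j, 0j], [0j, 0j]]
    for K in kraus:
        for r in range(2):
            for c in range(2):
                out[r][c] += sum(K[r][i] * rho[i][j] * complex(K[c][j]).conjugate()
                                 for i in range(2) for j in range(2))
    return out

def trace_norm_2x2(H):
    """Trace norm of a 2x2 Hermitian matrix, from its two eigenvalues."""
    t = (H[0][0] + H[1][1]).real
    gap = math.sqrt((H[0][0] - H[1][1]).real ** 2 + 4 * abs(H[0][1]) ** 2)
    return abs((t + gap) / 2) + abs((t - gap) / 2)

def embed(kraus, rho):
    """Blocks of C(rho): the flag |0> block is rho/2, the flag |1> block is Q(rho)/2."""
    q_rho = apply_kraus(kraus, rho)
    block0 = [[x / 2 for x in row] for row in rho]
    block1 = [[x / 2 for x in row] for row in q_rho]
    return block0, block1

def sub(A, B):
    return [[A[r][c] - B[r][c] for c in range(2)] for r in range(2)]

gamma = 0.3                                        # illustrative damping strength
Q1 = [[[1, 0], [0, 1]]]                            # identity channel
Q2 = [[[1, 0], [0, math.sqrt(1 - gamma)]],         # amplitude damping channel
      [[0, math.sqrt(gamma)], [0, 0]]]
rho = [[0.5, 0.5], [0.5, 0.5]]                     # the state |+><+|

c1, c2 = embed(Q1, rho), embed(Q2, rho)
# The output is block diagonal in the flag qubit, so the block norms add up.
lhs = sum(trace_norm_2x2(sub(b1, b2)) for b1, b2 in zip(c1, c2))
rhs = trace_norm_2x2(sub(apply_kraus(Q1, rho), apply_kraus(Q2, rho)))
```

Here `lhs` is the trace norm of $(C_1 - C_2)(\rho)$ and `rhs` that of $(Q_1 - Q_2)(\rho)$; the first block of the difference vanishes, which is the $\rho - \rho$ term in the proof.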

Let $(Q_1, Q_2)$ be an instance of $\mathrm{QCD}_{a,b}$. As it is demonstrated in the previous section
how to efficiently construct the channels $C_i$ from the channels $Q_i$, this lemma implies
the following reduction

$$\mathrm{QCD}_{a,b} \le_m \text{Degradable } \mathrm{QCD}_{a/2,\,b/2}.$$

This implies that distinguishing degradable circuits is hard for all $0 < b < a \le 1$, using
the hardness result for general circuits (Corollary 5.9).






This result can be strengthened using a result from Section 3.7 on the polarization
of the diamond norm. The general construction does not preserve degradability, but
for the special case of interest using only a portion of the polarization construction
will suffice. The strategy is to take an instance $(C_1, C_2)$ of Degradable $\mathrm{QCD}_{1,\varepsilon}$ and
construct the instance $(C_1^{\otimes k}, C_2^{\otimes k})$. This second instance will have outputs that are
more distinguishable, for the simple reason that there are more copies of the states to
be distinguished available. This will send the norm for 'yes' instances of the problem
from 1 to a value close to 2, but it will also have the property that the norm of 'no'
instances is not made too large. This is a straightforward consequence of Lemma 3.25,
which appears as part of the procedure for polarizing the diamond norm.

Corollary 6.3. For any constants $0 < b < a < 2$, the problem Degradable $\mathrm{QCD}_{a,b}$ is
QIP-complete.

Proof. This problem is in QIP as it is a restriction of the general problem, which is
in QIP by Theorem 5.3. To see that it is QIP-hard, take an instance $(Q_1, Q_2)$ of the
QIP-complete problem $\mathrm{QCD}_{2,\varepsilon}$ for $\varepsilon > 0$ a constant.

Applying the construction of Section 6.2 to $(Q_1, Q_2)$ results in the instance $(C_1, C_2)$
of Degradable $\mathrm{QCD}_{1,\varepsilon/2}$, by Lemma 6.2. As the degradable channels are closed under
tensor products, $(C_1^{\otimes k}, C_2^{\otimes k})$ is a pair of circuits implementing degradable channels.

By Lemma 3.25 we have the following implications

$$\begin{aligned}
\|C_1 - C_2\|_\diamond \ge 1 &\implies \|C_1^{\otimes k} - C_2^{\otimes k}\|_\diamond \ge 2 - 2e^{-k/8}, \\
\|C_1 - C_2\|_\diamond \le \frac{\varepsilon}{2} &\implies \|C_1^{\otimes k} - C_2^{\otimes k}\|_\diamond \le \frac{k\varepsilon}{2}.
\end{aligned}$$

These equations imply that for any constants $0 < b < a < 2$, there are choices of the
constants $k, \varepsilon$ so that

$$\begin{aligned}
\|Q_1 - Q_2\|_\diamond = 2 &\implies \|C_1^{\otimes k} - C_2^{\otimes k}\|_\diamond > a, \\
\|Q_1 - Q_2\|_\diamond \le \varepsilon &\implies \|C_1^{\otimes k} - C_2^{\otimes k}\|_\diamond < b,
\end{aligned}$$

which implies the QIP-hardness of Degradable $\mathrm{QCD}_{a,b}$. □
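The choice of the constants $k$ and $\varepsilon$ in this proof is simple arithmetic, which can be sketched as follows. This assumes that the bounds supplied by Lemma 3.25 take the form $2 - 2e^{-k/8}$ for 'yes' instances and $k\varepsilon/2$ for 'no' instances; those forms are a reading of the damaged displayed equations rather than a quotation of the lemma, and the function names are hypothetical.

```python
import math

def yes_bound(k):
    """Assumed lower bound on the norm of a 'yes' instance after k copies."""
    return 2 - 2 * math.exp(-k / 8)

def no_bound(k, eps):
    """Assumed upper bound on the norm of a 'no' instance after k copies."""
    return k * eps / 2

def polarization_parameters(a, b):
    """Pick k and eps so that yes_bound(k) > a and no_bound(k, eps) < b."""
    if not 0 < b < a < 2:
        raise ValueError("expected 0 < b < a < 2")
    # yes_bound(k) > a  is equivalent to  k > 8 * ln(2 / (2 - a)).
    k = math.floor(8 * math.log(2 / (2 - a))) + 1
    # With eps = b / k the 'no' bound becomes b / 2, which is below b.
    eps = b / k
    return k, eps

k, eps = polarization_parameters(1.9, 0.1)
```

Note that $k$ depends only on $a$, and $\varepsilon$ is then chosen small enough for that $k$, so both remain constants independent of the input circuits.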



6.4 Simulation by an antidegradable channel 



In this section a construction very similar to that used in Section 6.2 is presented
that takes any circuit $Q$ to a circuit $C$ implementing an antidegradable channel. The
idea is to (with probability one-half) send the input state to the environment, so that
the channel that maps the environment state to the output state will have a copy of
the input state. This construction (and the proof that it produces an antidegradable
channel) is very similar to a construction used in [CRS08] for degradable channels.

Figure 6.3: The antidegradable channel $C$ constructed from $Q$.

Once again we may assume that $Q$ implements a channel in $T(\mathcal{H}, \mathcal{H})$, i.e. that $Q$
has the same input and output dimension, by embedding the smaller space into the
larger, if necessary. As in Section 6.2, the constructed circuit $C$ will use one additional
output qubit, implementing an antidegradable transformation in $T(\mathcal{H}, \mathcal{C} \otimes \mathcal{H})$.

Let $Q$ implement the transformation given by

$$Q(\rho) = \operatorname{tr}_{\mathcal{B}} U (\rho \otimes |0\rangle\langle 0|) U^*,$$

where, as usual, since $Q$ is assumed (without loss of generality) to be in Stinespring
form, the input specifies a circuit for computing the unitary $U$. The channel $C$ will be
constructed as

$$C(\rho) = \frac{1}{2}|0\rangle\langle 0| \otimes |0\rangle\langle 0| + \frac{1}{2}|1\rangle\langle 1| \otimes Q(\rho). \tag{6.3}$$

This is just the channel that applies $Q$ with probability one-half, outputs $|0\rangle$ with
probability one-half, and outputs a flag qubit in the space $\mathcal{C}$ to indicate which case has
occurred. In a way very similar to the construction in Section 6.2, this channel can
be implemented using a controlled-$U$ operation. In this case, however, we will also
need the operation $W$ that swaps the states in two spaces (i.e. $W|a\rangle|b\rangle = |b\rangle|a\rangle$). An
implementation of the channel $C$ is given in Figure 6.3. This circuit will, depending
on the value of the control qubit in the space $\mathcal{C}$, either apply $Q$ or output the pure state
$|0\rangle$, as required.

To show that the circuit $C$ implements an antidegradable channel, we explicitly
construct the map $A_C$ that maps the environment state of $C$ to the output state. The
environment state of $C$ is once again simply the state produced by $C^{C}$, the complementary
channel to $C$. As before, this channel is only defined up to an isometry, but
this is not significant as this isometry can be absorbed into the definition of $A_C$. One
implementation of the channel $C^{C}$ is obtained by considering the channel mapping
the input of $C$ to the space traced out by the circuit in Figure 6.3. This channel is given
by

$$C^{C}(\rho) = \frac{1}{2}|0\rangle\langle 0| \otimes \rho + \frac{1}{2}|1\rangle\langle 1| \otimes Q^{C}(\rho), \tag{6.4}$$

where once again the channel $Q^{C}$ is given by

$$Q^{C}(\rho) = \operatorname{tr}_{\mathcal{H}} U (\rho \otimes |0\rangle\langle 0|) U^*.$$

Given the state in Equation (6.4), it is not hard to see how to map it to the state in
Equation (6.3). This can be done by implementing one of two operations, depending
on the value of the flag qubit in the space $\mathcal{C}$, which is the 'copy' of the control qubit
traced out in Figure 6.3. If this qubit is in the state $|0\rangle$, then the other portion of the
input state is $\rho$, the original input to $C$, so that applying the circuit for $Q$ produces the
state $Q(\rho)$. If the control qubit is in the $|1\rangle$ state, however, the remainder of the input
state is $Q^{C}(\rho)$. This state can be discarded (i.e. traced out) and ancillary qubits in the
state $|0\rangle$ can be used as the output, using the swap operation $W$. As before, the value
of the qubit in $\mathcal{C}$ needs to be flipped with a Pauli $X$ gate so that the state is exactly
correct. A circuit implementing this is shown in Figure 6.4.

Figure 6.4: The anti-degrading channel corresponding to the channel $C$ in Figure 6.3.



To see that $A_C$ correctly implements the anti-degrading map for $C$, we may compute

$$\begin{aligned}
A_C(C^{C}(\rho)) &= \frac{1}{2} A_C\big(|0\rangle\langle 0| \otimes \rho + |1\rangle\langle 1| \otimes Q^{C}(\rho)\big) \\
&= \frac{1}{2}|1\rangle\langle 1| \otimes Q(\rho) + \frac{1}{2}|0\rangle\langle 0| \otimes |0\rangle\langle 0| \\
&= C(\rho),
\end{aligned}$$

where the final equality is Equation (6.3). This demonstrates that the channel $C$ constructed
from $Q$ is antidegradable. In the following section the implications of this
construction for the hardness of computationally distinguishing antidegradable channels
are considered.



6.5 Distinguishing antidegradable channels 

In a very similar way to the degradable case, the construction in the previous section
embeds any channel into an antidegradable one. In exactly the same manner as
Section 6.3, this can be used to show the hardness of distinguishing circuits that
implement antidegradable transformations.

As in the degradable case, the distinguishability problem in the antidegradable 
case is simply the restriction of the problem to a smaller class of channels. 

Problem 6.4 (Antidegradable Quantum Circuit Distinguishability). For constants
$0 \le b < a \le 2$, the input consists of quantum circuits $C_1$ and $C_2$ that implement
antidegradable transformations in $T(\mathcal{H}, \mathcal{K})$. The promise problem is to distinguish the
two cases:

Yes: $\|C_1 - C_2\|_\diamond \ge a$.
No: $\|C_1 - C_2\|_\diamond \le b$.

Once again the key technique for proving that the problem is QIP-complete is to
place bounds on the diamond norm of the difference of two channels that have had
the construction of the previous section applied to them. The proof of this lemma is
identical to the proof of Lemma 6.2.



Lemma 6.5. Let $Q_1, Q_2$ be quantum circuits implementing transformations in $T(\mathcal{H}, \mathcal{K})$. If
$C_1, C_2 \in T(\mathcal{H}, \mathcal{C} \otimes \mathcal{K})$ are given by

$$C_i(\rho) = \frac{1}{2}|0\rangle\langle 0| \otimes |0\rangle\langle 0| + \frac{1}{2}|1\rangle\langle 1| \otimes Q_i(\rho),$$

for $i \in \{1, 2\}$, then

$$\|C_1 - C_2\|_\diamond = \frac{1}{2}\|Q_1 - Q_2\|_\diamond.$$

Proof. Let $\rho \in D(\mathcal{H} \otimes \mathcal{F})$ be arbitrary. Then

$$\|(C_1 \otimes I_{\mathcal{F}} - C_2 \otimes I_{\mathcal{F}})(\rho)\|_{\mathrm{tr}} = \frac{1}{2}\|(Q_1 \otimes I_{\mathcal{F}} - Q_2 \otimes I_{\mathcal{F}})(\rho)\|_{\mathrm{tr}},$$

as in the proof of Lemma 6.2. This implies the statement of the lemma. □

Exactly as in the degradable case, this implies the QIP-hardness of distinguishing
antidegradable channels for constants $0 < b < a \le 1$. Once again, this can be
strengthened to any constants $0 < b < a < 2$ using the polarization techniques of
Section 3.7, using the property that the antidegradable channels are closed under
tensor products. The next corollary follows from Lemma 6.5 in exactly the same
manner that Corollary 6.3 follows from Lemma 6.2, so the proof has been omitted.

Corollary 6.6. For any constants $0 < b < a < 2$, the problem Antidegradable $\mathrm{QCD}_{a,b}$ is
QIP-complete.



6.6 Conclusion 

This chapter has presented a construction for embedding an arbitrary channel into
a degradable channel due to Cubitt, Ruskai, and Smith [CRS08], as well as a closely
related construction for antidegradable channels. These constructions can be efficiently
implemented on quantum circuits, so that instances of the quantum circuit
distinguishability problem can be mapped to degradable or antidegradable channels.

The main result of the chapter is that the distinguishability problem on quantum
circuits remains hard when restricted to either the class of degradable channels or the
class of antidegradable channels. The proof of this result makes use of the diamond
norm polarization techniques of Section 3.7.

Chapter 7

Mixed-Unitary Channels 



The mixed-unitary channels are an interesting class of quantum operations. These 
are the channels that probabilistically apply one of a set of unitary operations. These 
channels have several interesting properties and many of the common transformations 
used in quantum information are mixed-unitary. For these reasons the problems of 
determining the additivity of the classical capacity of a mixed-unitary channel and 
distinguishing circuits that implement mixed-unitary operations are important steps 
toward understanding these problems. 

In the distinguishability case it is shown that distinguishing mixed-unitary channels 
is exactly as computationally difficult as general channels, using a reduction that 
essentially simulates a general channel with a mixed unitary one. In the case of 
additivity, a similar reduction is used to show that given a channel, there is a mixed- 
unitary channel that is approximately additive if and only if the original channel is 
additive. By sending the approximation error to zero this produces a sequence of 
mixed-unitary channels with the property that the original channel is additive if and 
only if the tail of the sequence consists of additive mixed-unitary channels. 



The results in this chapter have been published in [Ros08a].

Contents

7.4 Properties of the constructed channel
7.5 Multiplicativity of mixed-unitary transformations
7.6 Mixed-unitaries and minimum output entropy
7.7 Circuit constructions
7.8 QIP-completeness of distinguishing mixed-unitary circuits
7.9 Conclusion



7.1 Mixed-unitary channels 

As defined in Chapter 1, a quantum channel $\Phi$ is mixed-unitary if there exist unitary
operators $U_1, \ldots, U_n$ and a probability distribution $p_1, \ldots, p_n$ such that

$$\Phi(X) = \sum_{i=1}^{n} p_i U_i X U_i^*. \tag{7.1}$$

These channels have many interesting properties. They are commonly
known as the random unitary channels, but they will be referred to as the mixed-unitary
channels here to avoid confusion with unitary operators chosen at
random according to some measure. This notational choice was suggested by Watrous
in [Wat09a].



It has been shown by Gregoratti and Werner [GW03] that the mixed-unitary channels
describe exactly the noise processes that can be corrected using classical information
obtained by measuring the environment. One way to see that this correction is
possible is to consider a Stinespring representation for the channel in Equation (7.1).
One such representation can be constructed using the operations $V$ and $W$ given by

$$V|\psi\rangle|i\rangle = (U_i|\psi\rangle)|i\rangle, \qquad W|0\rangle = \sum_i \sqrt{p_i}\,|i\rangle.$$

The operation $V$ is a unitary operation in $\mathrm{U}(\mathcal{H} \otimes \mathcal{A}, \mathcal{K} \otimes \mathcal{B})$ and the operator $W$ can be
extended to a unitary operation in $\mathrm{U}(\mathcal{A})$ in an arbitrary way. A Stinespring representation
for $\Phi$ is then given by

$$\Phi(\rho) = \operatorname{tr}_{\mathcal{B}} V (I_{\mathcal{H}} \otimes W)(\rho \otimes |0\rangle\langle 0|)(I_{\mathcal{H}} \otimes W^*) V^*,$$

where the operator $W$ prepares a weighted superposition on the ancillary space, the
operator $V$ applies the corresponding unitary operator from Equation (7.1), and finally
the partial trace over $\mathcal{B}$ produces the desired mixture. To see that this can be perfectly
reversed with a measurement of $\mathcal{B}$, notice that when applied to the state $\rho$, before the
partial trace the system is in the state

$$\sum_{i,j} \sqrt{p_i p_j}\, U_i \rho U_j^* \otimes |i\rangle\langle j|.$$

Measuring the second system in the computational basis gives an outcome $a$ with
probability $p_a$, leaving the first system in the state

$$\frac{p_a U_a \rho U_a^*}{p_a} = U_a \rho U_a^*.$$

The original state can then be recovered by simply applying the operator $U_a^*$ selected by the
outcome of the measurement. This example describes in principle any mixed-unitary
channel, due to the uniqueness of the Stinespring representation, up to an isometry on
the space $\mathcal{B}$ which corresponds to a different measurement. The fact that the mixed-unitary
channels are the only channels that can be corrected using the strategy of
measuring the environment and applying a correction is more complicated and can be
found in [GW03].
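The correction procedure described above can be traced through on a single qubit. The sketch below is illustrative only: the choice of the unitaries $I$, $X$, $Z$, the probabilities, and the input state are assumptions made for the example, and `joint_block` is a hypothetical helper returning the block $\sqrt{p_i p_j}\, U_i \rho U_j^*$ of the state before the partial trace.

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def dagger(A):
    return [[complex(A[j][i]).conjugate() for j in range(len(A))]
            for i in range(len(A[0]))]

# Illustrative mixed-unitary channel: I, X and Z with probabilities 0.5, 0.3, 0.2.
unitaries = [[[1, 0], [0, 1]], [[0, 1], [1, 0]], [[1, 0], [0, -1]]]
probs = [0.5, 0.3, 0.2]
rho = [[0.75, 0.25j], [-0.25j, 0.25]]      # an arbitrary single-qubit density matrix

def joint_block(i, j):
    """Block sqrt(p_i p_j) U_i rho U_j^* of the state before the partial trace."""
    scale = math.sqrt(probs[i] * probs[j])
    block = matmul(matmul(unitaries[i], rho), dagger(unitaries[j]))
    return [[scale * x for x in row] for row in block]

recovered = []
for a, U in enumerate(unitaries):
    block = joint_block(a, a)                     # unnormalized state given outcome a
    p_a = (block[0][0] + block[1][1]).real        # probability of outcome a
    post = [[x / p_a for x in row] for row in block]
    # Undo the unitary named by the measurement outcome.
    recovered.append((p_a, matmul(matmul(dagger(U), post), U)))
```

Each entry of `recovered` pairs the probability of an outcome with the corrected state, which coincides with the original `rho` for every outcome.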



One question that this correction scheme raises is how much classical information
must be recovered from the environment to correct a mixed-unitary channel. This
corresponds to minimizing the number of operators $U_i$ in Equation (7.1). A simple
bound on this quantity is given by Buscemi [Bus06], who shows that the number $n$ of
unitary operators in Equation (7.1) is at most the square of the minimum number of
Kraus operators in a Kraus representation of $\Phi$.

Audenaert and Scheel have also provided a characterization of the mixed-unitary
channels [AS08], and used it to construct a measure of the distance from a quantum
channel to the set of mixed-unitary channels.

The remainder of this chapter provides an answer to the question: are the problems
of the additivity of the classical capacity and the distinguishability of quantum
channels simplified when restricted to mixed-unitary channels? This is answered
in the negative, using a method to approximate an arbitrary quantum channel by a
mixed-unitary one. This approximation will only faithfully implement the channel
on low-entropy outputs, as the technique used will be able to decide when the approximation
would fail and instead produce a highly mixed state. This suffices to
produce a channel with the same minimum output entropy or maximum output
$p$-norm, however, as these quantities are defined only by the low-entropy outputs of the
channel.

These results extend the work of Fukuda [Fuk07] on unital channels to the mixed-unitary
case. The unital case is discussed in the next section. In Section 7.3 the
mixed-unitary case is described. Section 7.4 proves some properties of the construction
that will be used in Sections 7.5 and 7.6, which consider the multiplicativity of the
$p$-norm and the additivity of the minimum output entropy (respectively). An efficient
circuit construction for this reduction is then presented in Section 7.7, which is used in
Section 7.8 to reduce the circuit distinguishability problem from general channels to
the mixed-unitary channels.



7.2 Unital channels 

Recall from Chapter 1 that a channel $\Phi \in T(\mathcal{H}, \mathcal{K})$ is doubly stochastic if
$\Phi(\mathbb{1}_{\mathcal{H}}/\dim\mathcal{H}) = \mathbb{1}_{\mathcal{K}}/\dim\mathcal{K}$,
and unital if it is also the case that $\mathcal{H} = \mathcal{K}$. This section provides an overview of a
result related to the main results of the chapter. This result is Fukuda's proof that
the additivity of the minimum output entropy or the multiplicativity of the maximum
output $p$-norm of an arbitrary channel $\Phi$ is equivalent to the same problem on a related
unital channel $\Phi'$ [Fuk07].

The unital channels can be characterized in a similar way to the mixed-unitary
channels. Mendl and Wolf have shown that any unital channel $\Phi$ can be represented
in the form

$$\Phi(\rho) = \sum_i \lambda_i U_i \rho U_i^*,$$

where the $U_i$ are unitary operators and $\sum_i \lambda_i = 1$, with $\lambda_i \in \mathbb{R}$ for all $i$ [MW09].
It is also of note that for channels on qubits the mixed-unitary channels are exactly
the unital channels [Tre86, KM87]. This is no longer true for channels on systems
of larger dimension, as shown by Landau and Streater [LS93]. An interesting fact is
that the unital but not mixed-unitary channel that they use is, up to unitary conjugation,
the same channel used by Werner and Holevo to find the first example of the
super-multiplicativity of the $p$-norm of a quantum channel [WH02]. This might suggest that
these channels are the key to this property, but the results of this chapter imply that
the mixed-unitary channels do not hold a special place with respect to this property of
super-multiplicativity. Indeed, Hayden and Winter [HW08] have shown that mixed-unitary
channels also exhibit this property.

The key ingredient in Fukuda's reduction to the unital case is the addition of an
extra input system that allows for an input determined selection of one of the discrete
Weyl operators $W_{i,j}$ introduced in Chapter 1 to be applied to the output of the channel.
Letting $\Phi \in T(\mathcal{H}, \mathcal{K})$, with $d = \dim \mathcal{K}$, the channel $\Phi'$ is constructed as

$$\Phi'(\rho \otimes |i,j\rangle\langle i,j|) = W_{i,j} \Phi(\rho) W_{i,j}^*, \tag{7.2}$$

for any $1 \le i, j \le d$. This defines a channel $\Phi'$ in $T(\mathcal{K} \otimes \mathcal{K} \otimes \mathcal{H}, \mathcal{K})$ by linearity. Such a
channel can be implemented by measuring the input space $\mathcal{K} \otimes \mathcal{K}$ in the computational
basis to decide which of the Weyl operators to apply and then tracing out the result.
This is the same construction used by Shor to prove that the additivity of the minimum
output entropy of a channel $\Phi$ implies the additivity of the Holevo $\chi$-capacity of the
channel $\Phi'$ [Sho04]. This construction was discussed in Section 3.3.1.



To see that this channel is doubly stochastic, notice that on any input of the form
$\rho \otimes \mathbb{1}_{\mathcal{K} \otimes \mathcal{K}}/d^2$ the output is

$$\Phi'\Big(\rho \otimes \frac{\mathbb{1}_{\mathcal{K} \otimes \mathcal{K}}}{d^2}\Big) = \frac{1}{d^2} \sum_{i,j} W_{i,j} \Phi(\rho) W_{i,j}^* = \frac{\mathbb{1}_{\mathcal{K}}}{d}.$$

The fact that this mixture of discrete Weyl operators mixes states in this way is shown in
Proposition 7.2, as part of the proof that this operation is mixed-unitary. This equation
implies that, on the completely mixed input, the output of $\Phi'$ is the completely mixed
state $\mathbb{1}_{\mathcal{K}}/d$, as
required. This channel is not unital, but it can be made so by adding an additional
output space of the correct dimension in which the output state is always completely
mixed. This extra mixed state will affect the minimum output entropy or the maximum
output $p$-norm by a constant depending on $d$, and so it will have no effect on additivity
or multiplicativity.

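Fukuda proves the following result about this construction.

Theorem 7.1 (Fukuda [Fuk07]). Let $\Phi \in T(\mathcal{H}, \mathcal{K})$, let $\Psi$ be an arbitrary quantum channel,
and let $\Phi'$ be the doubly stochastic channel constructed from $\Phi$ as in Equation (7.2).
For these channels, and any $p \in [1, \infty]$,

$$S_{\min}(\Psi \otimes \Phi) = S_{\min}(\Psi \otimes \Phi') \quad \text{and} \quad \|\Psi \otimes \Phi\|_p = \|\Psi \otimes \Phi'\|_p.$$

Proof. Only a proof of the minimum output entropy case is provided, as the proof for
the maximum output $p$-norm is identical, with the concavity of the entropy replaced
by the triangle inequality.

To see that $S_{\min}(\Psi \otimes \Phi') \le S_{\min}(\Psi \otimes \Phi)$, notice that since $W_{0,0} = \mathbb{1}$, if the input in
the space that controls the Weyl operations is given as $|0,0\rangle$, then

$$(\Psi \otimes \Phi')(\rho \otimes |0,0\rangle\langle 0,0|) = \mathbb{1}_{\mathcal{K}} (\Psi \otimes \Phi)(\rho) \mathbb{1}_{\mathcal{K}} = (\Psi \otimes \Phi)(\rho),$$

from which the desired inequality follows immediately.

In the other direction, notice that since the channel $\Phi'$ can be assumed to immediately
measure the control input, the channel $\Psi \otimes \Phi'$ can be written as the probabilistic
application of one of the discrete Weyl operators to the output of $\Psi \otimes \Phi$. To this end,
let $\rho$ be an input minimizing $S((\Psi \otimes \Phi')(\rho))$, let the result of measuring the control
input in the computational basis be $|i,j\rangle$ with probability $p_{i,j}$, and let $\rho_{i,j}$ be the state
after the measurement has produced outcome $i,j$. In this notation we have

$$\begin{aligned}
S_{\min}(\Psi \otimes \Phi') &= S((\Psi \otimes \Phi')(\rho)) \\
&= S\Big(\sum_{i,j} p_{i,j} (\mathbb{1} \otimes W_{i,j})(\Psi \otimes \Phi)(\rho_{i,j})(\mathbb{1} \otimes W_{i,j}^*)\Big) \\
&\ge \sum_{i,j} p_{i,j}\, S\big((\mathbb{1} \otimes W_{i,j})(\Psi \otimes \Phi)(\rho_{i,j})(\mathbb{1} \otimes W_{i,j}^*)\big) \\
&= \sum_{i,j} p_{i,j}\, S\big((\Psi \otimes \Phi)(\rho_{i,j})\big) \\
&\ge S_{\min}(\Psi \otimes \Phi),
\end{aligned}$$

where the concavity of the entropy and the unitary invariance of the entropy have
been used. □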



In Sections 7.5 and 7.6, similar results are shown for the mixed-unitary channels,
though the techniques used to prove them do not seem to be directly related to
Fukuda's construction in the unital case.



7.3 Mixed-unitary approximation 

Given a representation of a channel $\Phi$ in Stinespring form, that is, an implementation
of the form

$$\Phi(X) = \operatorname{tr}_{\mathcal{B}} U (|0\rangle\langle 0| \otimes X) U^*, \tag{7.3}$$

there are only two operations that are not mixed-unitary. These are the partial trace
over the system $\mathcal{B}$ and the introduction of the ancillary system in the state $|0\rangle$. The
goal of this section is to describe a method for approximating these two operations
with mixed-unitaries, so that when combined with the circuit for the operation $U$ in
Equation (7.3), the result is a mixed-unitary approximation of $\Phi$.



Though this approximation does have an efficient circuit implementation, the discussion
of mixed-state quantum circuits is postponed to Section 7.7, as this allows the
construction to be described in simpler terms. Additionally, some of the applications
of this simulation technique do not depend on efficient circuit implementations, so this
simplified exposition is useful for readers not interested in computational complexity.
Despite this avoidance of the quantum circuit model, the figures here will use circuit
diagrams, but this is done only for clarity: no assumptions are made regarding
implementations of the channels depicted.

Figure 7.1: The channel $\Phi$ to be approximated by a mixed-unitary, in Stinespring form
with labelled spaces.

To describe this approximation we fix notation throughout the next three sections.
To this end let $\Phi \in T(\mathcal{H}, \mathcal{K})$ be a quantum channel, and let a Stinespring representation
for $\Phi$ be as given in Equation (7.3). In this representation let $\mathcal{A}$ be the
ancillary space starting in the $|0\rangle$ state, and let $\mathcal{B}$ be the space that is traced out.
This implies that the operator $U$ is a unitary map from $\mathcal{A} \otimes \mathcal{H}$ to $\mathcal{K} \otimes \mathcal{B}$. These spaces
are summarized in Figure 7.1.



7.3.1 Simulating the partial trace 



Of the two operations in Equation (7.3) that are not mixed-unitary, the partial trace is
the easiest to simulate with a mixed-unitary channel. At an intuitive level, the partial
trace represents the loss of information to the environment in a quantum channel, but
this operation is not mixed-unitary as it changes the dimension of the system being
considered. The direct approach to simulating this with a mixed-unitary is to model
the loss of information with a completely depolarizing channel, which avoids the issue
of the change in dimensionality. It is not hard to prove that this approach works, nor
is it hard to see that the completely depolarizing channel is mixed-unitary.

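This depolarizing channel can be implemented as a mixture of the discrete Weyl
operators, which are also known as the generalized Pauli operators [AMTdW00, BR03,
HLSW04]. These unitary operators, as discussed in Chapter 1, form an orthogonal basis
for the space $L(\mathcal{A})$ of linear operators on a Hilbert space $\mathcal{A}$ of dimension $d$.

Proposition 7.2. The completely depolarizing channel $N_{\mathcal{A}}$ on $\mathcal{A}$ has implementation as a
mixed-unitary channel given by

$$N_{\mathcal{A}}(\rho) = \frac{1}{d^2} \sum_{a,b=0}^{d-1} W_{a,b}\, \rho\, W_{a,b}^*.$$

Proof. Let $\rho \in D(\mathcal{A})$ be a density matrix and let $d = \dim \mathcal{A}$. By Equation (1.7) the
operators $W_{a,b}$ form a basis of $L(\mathcal{A})$, so that $\rho$ can be decomposed as

$$\rho = \sum_{e,f=0}^{d-1} \lambda_{e,f} W_{e,f} \tag{7.4}$$

for some coefficients $\lambda_{e,f} \in \mathbb{C}$. Notice also that since $W_{a,b}$ has trace zero unless
$a = b = 0$ and $W_{0,0} = \mathbb{1}_{\mathcal{A}}$, it is the case that

$$\lambda_{0,0} = \frac{\operatorname{tr} \rho}{d} = \frac{1}{d}.$$

Putting this decomposition into the proposed implementation, we obtain

$$\frac{1}{d^2} \sum_{a,b} W_{a,b}\, \rho\, W_{a,b}^* = \frac{1}{d^2} \sum_{a,b,e,f} \lambda_{e,f} W_{a,b} W_{e,f} W_{a,b}^*.$$

Using Equation (1.6) and the unitarity of the discrete Weyl operators to manipulate this
sum gives

$$\frac{1}{d^2} \sum_{a,b,e,f} \lambda_{e,f}\, \omega^{be - af}\, W_{e,f} W_{a,b} W_{a,b}^* = \frac{1}{d^2} \sum_{e,f} \lambda_{e,f} \Big( \sum_{a,b} \omega^{be - af} \Big) W_{e,f}.$$

Since $\omega$ is a primitive $d$th root of unity, this inner summation is zero unless $e = f = 0$,
and so we have shown that

$$\frac{1}{d^2} \sum_{a,b} W_{a,b}\, \rho\, W_{a,b}^* = \frac{1}{d^2} \lambda_{0,0} \sum_{a,b=0}^{d-1} W_{0,0} = \frac{1}{d^3} \sum_{a,b=0}^{d-1} \mathbb{1}_{\mathcal{A}} = \frac{\mathbb{1}_{\mathcal{A}}}{d} = N_{\mathcal{A}}(\rho),$$

as required. □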
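Proposition 7.2 is easy to confirm numerically in a small dimension. The sketch below assumes the convention $W_{a,b}|j\rangle = \omega^{bj}|j + a \bmod d\rangle$ for the discrete Weyl operators, which may differ by phases from the definition in Chapter 1 (such phases cancel in the conjugation); the function names and the test state are illustrative.

```python
import cmath

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def dagger(A):
    return [[complex(A[j][i]).conjugate() for j in range(len(A))]
            for i in range(len(A[0]))]

def weyl(a, b, d):
    """Assumed convention: W_{a,b}|j> = omega^(b*j) |j + a mod d>."""
    omega = cmath.exp(2j * cmath.pi / d)
    W = [[0j] * d for _ in range(d)]
    for j in range(d):
        W[(j + a) % d][j] = omega ** (b * j)
    return W

def depolarize(rho):
    """Uniform mixture of all d^2 Weyl conjugations of rho."""
    d = len(rho)
    out = [[0j] * d for _ in range(d)]
    for a in range(d):
        for b in range(d):
            W = weyl(a, b, d)
            term = matmul(matmul(W, rho), dagger(W))
            for r in range(d):
                for c in range(d):
                    out[r][c] += term[r][c] / d ** 2
    return out

# A qutrit density matrix with off-diagonal terms (trace one, positive semidefinite).
rho = [[0.5, 0.2 + 0.1j, 0.0],
       [0.2 - 0.1j, 0.3, 0.1j],
       [0.0, -0.1j, 0.2]]
result = depolarize(rho)
```

The matrix `result` agrees with the completely mixed state on three dimensions up to floating-point error, independently of the input state chosen.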

This proves that the channel $N_{\mathcal{B}}$ that completely depolarizes the space $\mathcal{B}$ can be
implemented as a mixed-unitary channel. To see that this channel can be used to
replace the partial trace, observe that one implementation of this channel simply traces
out the state in $\mathcal{B}$ and replaces it with the completely mixed state
$\mathbb{1}_{\mathcal{B}}/\dim\mathcal{B}$, which has been separately prepared.
From this implementation it is clear that for a state $\rho \in D(\mathcal{A} \otimes \mathcal{B})$ it holds that

$$N_{\mathcal{B}}(\rho) = (\operatorname{tr}_{\mathcal{B}} \rho) \otimes \frac{\mathbb{1}_{\mathcal{B}}}{\dim\mathcal{B}}, \tag{7.5}$$

and this property will hold no matter how $N_{\mathcal{B}}$ is implemented. This implies that if the
system to be traced out instead has $N_{\mathcal{B}}$ applied to it, the resulting state is the same, up
to a tensor factor of a maximally mixed state in the space $\mathcal{B}$. By replacing the partial
trace over $\mathcal{B}$ in Equation (7.3) with this channel, the result is

$$N_{\mathcal{B}}\big(U (|0\rangle\langle 0| \otimes X) U^*\big) = \Phi(X) \otimes \frac{\mathbb{1}_{\mathcal{B}}}{\dim\mathcal{B}}.$$

This is the best that can be hoped for, as a mixed-unitary transformation cannot change
the dimension of the system it acts on.
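The replacement of the partial trace by a depolarizing channel, as in Equation (7.5), can be illustrated on two qubits. This is a sketch under assumptions: the second qubit plays the role of the space to be traced out, the Weyl operators follow the convention $W_{a,b}|j\rangle = \omega^{bj}|j + a \bmod d\rangle$, and the entangled test state and helper names are illustrative.

```python
import cmath

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def dagger(A):
    return [[complex(A[j][i]).conjugate() for j in range(len(A))]
            for i in range(len(A[0]))]

def kron(A, B):
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def weyl(a, b, d):
    """Assumed convention: W_{a,b}|j> = omega^(b*j) |j + a mod d>."""
    omega = cmath.exp(2j * cmath.pi / d)
    W = [[0j] * d for _ in range(d)]
    for j in range(d):
        W[(j + a) % d][j] = omega ** (b * j)
    return W

def depolarize_second(rho):
    """Apply the uniform Weyl mixture to the second qubit only."""
    identity = [[1, 0], [0, 1]]
    out = [[0j] * 4 for _ in range(4)]
    for a in range(2):
        for b in range(2):
            V = kron(identity, weyl(a, b, 2))
            term = matmul(matmul(V, rho), dagger(V))
            for r in range(4):
                for c in range(4):
                    out[r][c] += term[r][c] / 4
    return out

def partial_trace_second(rho):
    return [[sum(rho[2 * i + k][2 * j + k] for k in range(2)) for j in range(2)]
            for i in range(2)]

psi = [0.5, 0.5j, -0.5, 0.5]                       # an entangled two-qubit pure state
rho = [[x * complex(y).conjugate() for y in psi] for x in psi]
result = depolarize_second(rho)
expected = kron(partial_trace_second(rho), [[0.5, 0], [0, 0.5]])
```

The two matrices `result` and `expected` coincide: depolarizing the second qubit leaves exactly the reduced state on the first qubit, tensored with a maximally mixed qubit.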

7.3.2 Simulating the ancillary space 

Replacing the introduction of the ancillary space $\mathcal{A}$ with a mixed-unitary operation
is more complicated than replacing the partial trace. In order to do this the input
space of the transformation is expanded to include the space $\mathcal{A}$. The input state
of this system will not, in general, be the desired state $|0\rangle$, so additional operations
are needed to ensure that this is the case for any input state that either maximizes
distinguishability or minimizes the output entropy of the resulting channel.

Because these quantities of interest, the minimum output entropy and the maxi- 
mum output p-norm, involve optimizing over input states, the channel can be con- 
structed so that those inputs that achieve the optimal value have the desired property: 
the input state in the space A is (close to) the state |0). Given this property, the val- 
ues of these optimizations will stay approximately the same when taken over the 
mixed-unitary simulations of the original channels. 

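To this end, the ideal operation $\Lambda$ to ensure this condition does not alter any input
state of the form $|0\rangle\langle 0| \otimes \sigma$, but takes any orthogonal state to the completely mixed state
$\mathbb{1}_{\mathcal{A} \otimes \mathcal{H}}/\dim(\mathcal{A} \otimes \mathcal{H})$. This operation is, unfortunately, not mixed-unitary, as it is not unital, since

$$\Lambda\Big(\frac{\mathbb{1}_{\mathcal{A} \otimes \mathcal{H}}}{\dim(\mathcal{A} \otimes \mathcal{H})}\Big) = \frac{1}{\dim \mathcal{A}} |0\rangle\langle 0| \otimes \frac{\mathbb{1}_{\mathcal{H}}}{\dim \mathcal{H}} + \Big(1 - \frac{1}{\dim \mathcal{A}}\Big) \frac{\mathbb{1}_{\mathcal{A} \otimes \mathcal{H}}}{\dim(\mathcal{A} \otimes \mathcal{H})}. \tag{7.6}$$

Notice, however, that this channel deviates from unitality with additive error $1/\dim \mathcal{A}$:
there is a very good unital, and indeed mixed-unitary, approximation to this ideal
channel, which is described in the remainder of this section.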

This closely related mixed-unitary channel first projects the input state either onto
the subspace $S_0 = |0\rangle \otimes \mathcal{H}$ or the orthogonal subspace $S_0^\perp = |0\rangle^\perp \otimes \mathcal{H}$. This projection
is then followed by a completely depolarizing channel on the subspace $S_0^\perp$. These
operations can be implemented using mixed-unitary channels, and the distance from
the ideal channel will go as $O(1/\dim \mathcal{A})$, which allows the error to be made arbitrarily
small by padding $\mathcal{A}$ with an unused ancillary space.

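The mixing process on $S_0^\perp$ is introduced first. It is given by the channel $M$
that does not affect the subspace $S_0$ but completely depolarizes the space $S_0^\perp$. More
concretely, on a state $\rho = q \rho_{S_0} + (1 - q) \rho_{S_0^\perp}$, where $\rho_{S_0} = |0\rangle\langle 0| \otimes \sigma$ is a density operator
on $S_0$ and $\rho_{S_0^\perp}$ a density operator on $S_0^\perp$, the output of $M$ is given by

$$\begin{aligned}
M(\rho) &= q M(\rho_{S_0}) + (1 - q) M(\rho_{S_0^\perp}) = q \rho_{S_0} + (1 - q) \frac{\mathbb{1}_{S_0^\perp}}{\dim S_0^\perp} \\
&= q |0\rangle\langle 0| \otimes \sigma + (1 - q) \frac{\mathbb{1}_{\mathcal{A}} - |0\rangle\langle 0|}{\dim \mathcal{A} - 1} \otimes \frac{\mathbb{1}_{\mathcal{H}}}{\dim \mathcal{H}}.
\end{aligned} \tag{7.7}$$

Here notation has been abused somewhat: $S_0$ and $S_0^\perp$ are Hilbert spaces, but the whole
space $\mathcal{A} \otimes \mathcal{H}$ is not the tensor product of these two spaces.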

The channel $M$ can be implemented as a mixed-unitary channel in the same way as
the completely depolarizing channel: a uniform mixture of the discrete Weyl operators,
except here these operators are taken over the subspace $S_0^\perp$. These operators exist, and
the whole construction is very similar to the one given in Proposition 7.2. More
concretely, where $W_{a,b}$ for $a, b \in \{0, \ldots, d-1\}$ are the discrete Weyl operators on the space $S_0^\perp$
(so that here $d = \dim S_0^\perp$) and $\mathbb{1}_{S_0}$ is the identity on the space $S_0$, the channel $M$ can be implemented as

$$M(\rho) = \frac{1}{d^2} \sum_{a,b=0}^{d-1} (\mathbb{1}_{S_0} \oplus W_{a,b})\, \rho\, (\mathbb{1}_{S_0} \oplus W_{a,b})^*,$$

where all of the operators $\mathbb{1}_{S_0} \oplus W_{a,b}$ are unitary by construction.
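The action of $M$ in Equation (7.7) can be checked on the smallest nontrivial case. The following sketch makes several assumptions for illustration: the ancillary space has dimension three, the input space of the channel is trivial, so that $S_0$ is spanned by $|0\rangle$ and $S_0^\perp$ by $|1\rangle$ and $|2\rangle$, and the Weyl operators on $S_0^\perp$ follow the convention $W_{a,b}|j\rangle = \omega^{bj}|j + a \bmod d\rangle$. The helper names are hypothetical.

```python
import cmath

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def dagger(A):
    return [[complex(A[j][i]).conjugate() for j in range(len(A))]
            for i in range(len(A[0]))]

def weyl(a, b, d):
    """Assumed convention: W_{a,b}|j> = omega^(b*j) |j + a mod d>."""
    omega = cmath.exp(2j * cmath.pi / d)
    W = [[0j] * d for _ in range(d)]
    for j in range(d):
        W[(j + a) % d][j] = omega ** (b * j)
    return W

def identity_plus(W):
    """Direct sum of the 1x1 identity on S_0 with W acting on the complement."""
    n = len(W) + 1
    out = [[0j] * n for _ in range(n)]
    out[0][0] = 1
    for r in range(len(W)):
        for c in range(len(W)):
            out[r + 1][c + 1] = W[r][c]
    return out

def mix(rho):
    """Uniform mixture of the unitaries (identity on S_0) + W_{a,b}."""
    d = len(rho) - 1
    out = [[0j] * len(rho) for _ in rho]
    for a in range(d):
        for b in range(d):
            V = identity_plus(weyl(a, b, d))
            term = matmul(matmul(V, rho), dagger(V))
            for r in range(len(rho)):
                for c in range(len(rho)):
                    out[r][c] += term[r][c] / d ** 2
    return out

q = 0.25
rho = [[q, 0, 0], [0, 1 - q, 0], [0, 0, 0]]        # q |0><0| + (1 - q) |1><1|
result = mix(rho)
expected = [[q, 0, 0], [0, (1 - q) / 2, 0], [0, 0, (1 - q) / 2]]
```

The component in $S_0$ is left untouched, while the weight $1 - q$ on $S_0^\perp$ is spread uniformly over that subspace, matching Equation (7.7) for this block-diagonal input.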

As previously mentioned, this channel does not implement the ideal transformation.
If the output of $M$ on $\rho_{S_0^\perp}$ in Equation (7.7) were the completely mixed state on
$\mathcal{A} \otimes \mathcal{H}$ and not the subspace $S_0^\perp$, then this process would create an essentially error-free
mixed-unitary approximation of the original channel (for the purpose of minimizing
the output entropy or maximizing distinguishability). The error involved
at this step can be shown in a later lemma to be $O(1/\dim \mathcal{A})$, which, by Equation (7.6),
is as close as a mixed-unitary channel can come to the ideal case. Fortunately, this
error can be made arbitrarily small by taking the space $\mathcal{A}$ large enough, and so this
construction can be used to approximate the ideal case.

It will be helpful for the analysis of this construction to remove the coherences
between the subspaces $S_0$ and $S_0^\perp$. The channel $M$ does not perform this operation. This is the
operation commonly known as dephasing that, applied to a density matrix expressed
in some basis, removes the off-diagonal terms. This aids the analysis of the construction,
because once this dephasing is applied, an equation similar to Equation (7.7)
would hold for all input states $\rho$, not just those states that have no coherences
between the subspaces $S_0$ and $S_0^\perp$.

While we will only need to apply dephasing between the two subspaces $S_0$ and
$S_0^\perp$, a mixed-unitary construction for the general case is provided below. This is the
channel $D$ that completely decoheres all information not stored in the computational
basis. More specifically, this channel implements

\[
D(|i\rangle\langle j|) = \delta_{ij}|i\rangle\langle j| =
\begin{cases}
|i\rangle\langle i| & \text{if } i = j,\\
0 & \text{otherwise.}
\end{cases} \tag{7.8}
\]

That this channel is mixed-unitary is simple to prove, using a construction based on the
discrete Weyl operators, similar to that used for the complete depolarizing channel in
Proposition 7.2. That this channel can be implemented in this way has been observed
in [DFH06].



Proposition 7.3. The completely dephasing channel on $\mathcal{A}$ defined in Equation (7.8) has
implementation as a mixed-unitary channel given by

\[
D(\rho) = \frac{1}{d} \sum_{b=0}^{d-1} W_{0,b}\, \rho\, W_{0,b}^*.
\]

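Proof. Recall that $W_{0,b}|j\rangle = Z^b|j\rangle = \omega^{bj}|j\rangle$, as introduced in an earlier chapter, where $\omega$ is a
primitive $d$th root of unity, with $d = \dim\mathcal{A}$. To see that this channel has the desired effect,
let $\rho = \sum_{i,j} \alpha_{ij}|i\rangle\langle j|$ so that

\[
\frac{1}{d}\sum_{b=0}^{d-1} W_{0,b}\,\rho\, W_{0,b}^*
 = \frac{1}{d}\sum_{b=0}^{d-1}\sum_{i,j=0}^{d-1} \alpha_{ij} Z^b|i\rangle\langle j|Z^{-b}
 = \frac{1}{d}\sum_{b=0}^{d-1}\sum_{i,j=0}^{d-1} \alpha_{ij}\,\omega^{b(i-j)}|i\rangle\langle j|. \tag{7.9}
\]

Then, since $\omega$ is a primitive $d$th root of unity,

\[
\sum_{b=0}^{d-1} \omega^{b(i-j)} =
\begin{cases}
d & \text{if } i = j,\\
0 & \text{otherwise.}
\end{cases}
\]

Combining this property with Equation (7.9) gives

\[
\frac{1}{d}\sum_{b=0}^{d-1} W_{0,b}\,\rho\, W_{0,b}^*
 = \frac{1}{d}\sum_{i=0}^{d-1} d\,\alpha_{ii}|i\rangle\langle i|
 = \sum_{i=0}^{d-1} \alpha_{ii}|i\rangle\langle i|,
\]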
which is exactly the channel defined by Equation (7.8). □

For the specific case that we will use here, a simpler construction suffices: instead of
applying this dephasing channel to the whole of the ancillary space $\mathcal{A}$ it only needs to
be applied to remove coherences between the two orthogonal subspaces $S_0$ and $S_0^\perp$. If
these two subspaces are viewed as a two-dimensional Hilbert space, the construction
in Proposition 7.3 can be reduced to the application of a specific unitary operator $V$
with probability one-half. The action of this unitary $V$ on basis states is given by

\[
V|i\rangle =
\begin{cases}
|i\rangle & \text{if } |i\rangle \in S_0,\\
-|i\rangle & \text{if } |i\rangle \in S_0^\perp.
\end{cases} \tag{7.10}
\]

In other words, $V$ applies a phase of $-1$ to states in $S_0^\perp$ and does not change states in $S_0$.
When $V$ is applied with probability one-half the result is complete dephasing between
the two subspaces. This can be seen by restricting the construction in Proposition 7.3
to the case of a two-dimensional system with orthogonal states that represent the
subspaces $S_0$ and $S_0^\perp$. More concretely, when this is applied to a density matrix
expressed in the computational basis, the result is, by a simple calculation, the zeroing
of the off-diagonal elements of the first row and column. Let this simplified dephasing
channel be given by

\[
D_{S_0}(\rho) = \frac{1}{2}\left[ V\rho V^* + \rho \right],
\]

where the operator $V$ is given in Equation (7.10). When this operation is applied to a
density operator $\rho \in \mathrm{D}(\mathcal{A} \otimes \mathcal{H})$, the result is

\[
D_{S_0}(\rho) = q\rho_{S_0} + (1-q)\rho_{S_0^\perp} = q|0\rangle\langle 0| \otimes \sigma + (1-q)\rho_{S_0^\perp}, \tag{7.11}
\]

where $\rho_{S_0} = |0\rangle\langle 0| \otimes \sigma$ is a density operator on the subspace $S_0 = |0\rangle \otimes \mathcal{H}$, $\rho_{S_0^\perp}$ is a
density operator on the orthogonal subspace $S_0^\perp$, and $0 \le q \le 1$ is a probability.
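The two dephasing constructions can be checked numerically on a small example. The following is a minimal sketch using only the Python standard library; the function names and the choice of a four-dimensional example are illustrative assumptions, not part of the construction itself. Density matrices are stored as nested lists of complex numbers.

```python
import cmath

def dephase_weyl(rho):
    """Apply (1/d) * sum_b Z^b rho Z^(-b), where Z|j> = w^j |j> (Proposition 7.3)."""
    d = len(rho)
    w = cmath.exp(2j * cmath.pi / d)  # primitive d-th root of unity
    out = [[0j] * d for _ in range(d)]
    for b in range(d):
        for i in range(d):
            for j in range(d):
                out[i][j] += w ** (b * (i - j)) * rho[i][j] / d
    return out

def dephase_subspace(rho, in_s0):
    """Apply (V rho V* + rho)/2 with V = +1 on S_0 and -1 on its complement.

    in_s0[i] states whether the basis vector |i> lies in S_0 (an assumed encoding).
    """
    d = len(rho)
    sign = [1 if in_s0[i] else -1 for i in range(d)]
    return [[(sign[i] * sign[j] * rho[i][j] + rho[i][j]) / 2 for j in range(d)]
            for i in range(d)]

# Uniform superposition on a 4-dimensional space: every entry equals 1/4.
rho = [[0.25] * 4 for _ in range(4)]
full = dephase_weyl(rho)                                    # all coherences removed
block = dephase_subspace(rho, [True, True, False, False])   # only S_0 / S_0-perp coherences removed
```

In this example `full` is diagonal, while `block` keeps the entries inside each of the two subspaces and zeroes only the entries that connect them.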

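Combining Equations (7.7) and (7.11), the output of $D_{S_0}$ followed by $M$ on a density
operator $\rho$ on $\mathcal{A} \otimes \mathcal{H}$ is given by a state of the form

\[
(M \circ D_{S_0})(\rho) = qM(|0\rangle\langle 0| \otimes \sigma) + (1-q)M(\rho_{S_0^\perp})
 = q|0\rangle\langle 0| \otimes \sigma + (1-q)\frac{\mathbb{1}_{\mathcal{A}} - |0\rangle\langle 0|}{\dim\mathcal{A} - 1} \otimes \tilde{\mathbb{1}}_{\mathcal{H}}.
\]

This operation, $M \circ D_{S_0}$, will be used as a way to force any input that results in a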
low output entropy to be close to the subspace So of inputs having the 'ancilla' space 
A in the desired |0) state. On these inputs the constructed mixed-unitary channel 
will behave in a similar way to the original channel that is being approximated. On 
inputs that are far from this subspace, the resulting state has high entropy, and so it 
will not be close to a state minimizing the output entropy and it will not be useful for 
distinguishing two channels constructed in this way.

7.3.3 Mixed-unitary approximation of a general channel

Putting these pieces together, given a channel $\Phi(\rho) = \operatorname{tr}_{\mathcal{B}} U(\rho \otimes |0\rangle\langle 0|)U^*$, the mixed-unitary
approximation $\Phi'$ is constructed as

\[
\Phi'(\rho) = N_{\mathcal{B}}\big(U\left[(M \circ D_{S_0})(\rho)\right]U^*\big), \tag{7.12}
\]

which, more plainly, is simply the application of the ancilla simulation procedure of
Section 7.3.2, the unitary operation from a Stinespring dilation of $\Phi$, and finally the
completely mixing channel to the space that would have been traced out by $\Phi$, as
discussed in Section 7.3.1. As the mixed-unitary channels are closed under composition,
the channel $\Phi'$ is mixed-unitary.

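It will be useful to observe that the constructed channel $\Phi'$ specified in Equation (7.12)
can be used to simulate the original channel $\Phi$. This occurs when the input
$|0\rangle\langle 0| \otimes \sigma$, i.e. an input in the space $S_0$, is provided to $\Phi'$. This is argued in the following
proposition.

Proposition 7.4. Let $\Phi \in \mathrm{T}(\mathcal{H}, \mathcal{K})$. If $\Phi' \in \mathrm{T}(\mathcal{A} \otimes \mathcal{H}, \mathcal{K} \otimes \mathcal{B})$ is the mixed-unitary channel
that is constructed from $\Phi$ in Equation (7.12), then

\[
\Phi'(|0\rangle\langle 0| \otimes \sigma) = \Phi(\sigma) \otimes \tilde{\mathbb{1}}_{\mathcal{B}}.
\]

Proof. Notice that both $D_{S_0}$ and $M$ do not affect this input: the decoherence operation
does not affect the state as it is in the subspace $S_0$ and $M$ does not affect the state
by Equation (7.7). Thus, the output of the channel $\Phi'$ is

\[
\begin{aligned}
\Phi'(|0\rangle\langle 0| \otimes \sigma) &= N_{\mathcal{B}}\big(U\left[(M \circ D_{S_0})(|0\rangle\langle 0| \otimes \sigma)\right]U^*\big)\\
 &= N_{\mathcal{B}}\big(U(|0\rangle\langle 0| \otimes \sigma)U^*\big)\\
 &= \operatorname{tr}_{\mathcal{B}}\big(U(|0\rangle\langle 0| \otimes \sigma)U^*\big) \otimes \tilde{\mathbb{1}}_{\mathcal{B}}\\
 &= \Phi(\sigma) \otimes \tilde{\mathbb{1}}_{\mathcal{B}},
\end{aligned}
\]

where the penultimate equality is an application of Equation (7.5). □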



Combining this proposition with Equation (7.11), which demonstrates the effect of
$M \circ D_{S_0}$ on states not of this form, and the observation that applying $M \circ D_{S_0}$ twice
has no further effect than applying it once, the output of $\Phi'$ on an arbitrary input state
$\rho$ is given by

\[
\Phi'(\rho) = q\Phi'(|0\rangle\langle 0| \otimes \sigma) + (1-q)\Phi'(\rho_{S_0^\perp}) = q\Phi(\sigma) \otimes \tilde{\mathbb{1}}_{\mathcal{B}} + (1-q)\Phi'(\rho_{S_0^\perp}), \tag{7.13}
\]

where as in Equation (7.11) $\rho_{S_0^\perp}$ is a density operator on the subspace of inputs
orthogonal to those with the state $|0\rangle$ on the space $\mathcal{A}$. The most significant portion
of the technical results in the next section lies in bounding the distance from the
maximally mixed state of the second term in this equation, from which most of the
results will follow straightforwardly.



7.4 Properties of the constructed channel 

This section provides the basis for the analysis of the mixed-unitary approximation
constructed in the previous section. The main result is a lower bound on the output
entropy when the constructed channel is applied to a state in $S_0^\perp$, the subspace of
inputs where the 'ancillary' subspace $\mathcal{A}$ is not in the desired $|0\rangle$ state. This result is not
difficult to show, but it will be essential to the results that follow.

Throughout this section, and the two sections that follow, the channel $\Phi$ will
represent an arbitrary transformation, and $\Phi'$ will represent the mixed-unitary
transformation constructed from it, as in Equation (7.12). The names of the Hilbert spaces
that $\Phi$ acts on will be consistent with the previous section: $\Phi$ maps mixed states on
$\mathcal{H}$ to $\mathcal{K}$, using the ancillary system $\mathcal{A}$ and tracing out the system $\mathcal{B}$. The constructed
channel $\Phi'$ is mixed-unitary, mapping density matrices on $\mathcal{A} \otimes \mathcal{H}$ to $\mathcal{K} \otimes \mathcal{B}$.

As a first step to showing that $\Phi'$ approximates $\Phi$ it is shown that mixed-unitary
channels do not increase the distance of a state from the completely mixed state. This
lemma can be interpreted as the statement that the output of a mixed-unitary channel
is not more pure than the input. The Hilbert space $\mathcal{B}$ appearing in this lemma will
correspond to a reference system needed for the results in Section 7.8 - this generality
will not be needed for the results on the maximum output p-norm or the minimum
output entropy.

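Lemma 7.5. Let $|||\cdot|||$ be a unitarily invariant norm on $\mathrm{L}(\mathcal{A} \otimes \mathcal{B})$. If $\Psi \in \mathrm{T}(\mathcal{A})$ is mixed-unitary,
then for any $\rho \in \mathrm{D}(\mathcal{A} \otimes \mathcal{B})$

\[
|||(\Psi \otimes \mathbb{1}_{\mathrm{L}(\mathcal{B})})(\rho) - \tilde{\mathbb{1}}_{\mathcal{A}} \otimes \operatorname{tr}_{\mathcal{A}}\rho||| \le |||\rho - \tilde{\mathbb{1}}_{\mathcal{A}} \otimes \operatorname{tr}_{\mathcal{A}}\rho|||.
\]

Proof. As $\Psi$ is mixed-unitary, let $\Psi(X) = \sum_i p_i U_i X U_i^*$ with the $U_i$ unitary, $0 \le p_i \le 1$,
and $\sum_i p_i = 1$. For brevity, let $\hat{U}_i = U_i \otimes \mathbb{1}_{\mathcal{B}}$ for all $i$. Using this notation

\[
|||(\Psi \otimes \mathbb{1}_{\mathrm{L}(\mathcal{B})})(\rho) - \tilde{\mathbb{1}}_{\mathcal{A}} \otimes \operatorname{tr}_{\mathcal{A}}\rho|||
 = \Big|\Big|\Big| \sum_i p_i \hat{U}_i \rho \hat{U}_i^* - \tilde{\mathbb{1}}_{\mathcal{A}} \otimes \operatorname{tr}_{\mathcal{A}}\rho \Big|\Big|\Big|
 \le \sum_i p_i\, ||| \hat{U}_i \rho \hat{U}_i^* - \tilde{\mathbb{1}}_{\mathcal{A}} \otimes \operatorname{tr}_{\mathcal{A}}\rho |||. \tag{7.14}
\]

Notice that $U_i \tilde{\mathbb{1}}_{\mathcal{A}} U_i^* = \tilde{\mathbb{1}}_{\mathcal{A}}$, which implies that $\hat{U}_i(\tilde{\mathbb{1}}_{\mathcal{A}} \otimes \sigma)\hat{U}_i^* = \tilde{\mathbb{1}}_{\mathcal{A}} \otimes \sigma$. Using this fact,
as well as the unitary invariance of the norm, the right-hand side of Equation (7.14) becomes

\[
\sum_i p_i\, ||| \hat{U}_i (\rho - \tilde{\mathbb{1}}_{\mathcal{A}} \otimes \operatorname{tr}_{\mathcal{A}}\rho) \hat{U}_i^* |||
 = \sum_i p_i\, ||| \rho - \tilde{\mathbb{1}}_{\mathcal{A}} \otimes \operatorname{tr}_{\mathcal{A}}\rho |||
 = ||| \rho - \tilde{\mathbb{1}}_{\mathcal{A}} \otimes \operatorname{tr}_{\mathcal{A}}\rho |||.
\]

Combining this with Equation (7.14) yields the statement of the lemma. □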



This lemma will be used to show not only that the ancilla simulation procedure
sends states in the subspace $S_0^\perp$ of states where the ancillary space is not in the $|0\rangle$ state
to states that are highly mixed, but that the channel $\Phi'$ also has this behaviour. Before
doing this, however, the lemma is extended to the case of the von Neumann entropy,
where the proof is essentially identical, with the exception that the triangle inequality
is replaced by concavity.

Corollary 7.6. If $\Psi \in \mathrm{T}(\mathcal{A})$ is mixed-unitary, and $\rho \in \mathrm{D}(\mathcal{A})$, then

\[
S(\rho) \le S(\Psi(\rho)).
\]

Proof. Let $\Psi(\rho) = \sum_i p_i U_i \rho U_i^*$ as in the proof of Lemma 7.5. Using this notation, and
the concavity of the von Neumann entropy

\[
S(\Psi(\rho)) = S\Big(\sum_i p_i U_i \rho U_i^*\Big) \ge \sum_i p_i S(U_i \rho U_i^*) = \sum_i p_i S(\rho) = S(\rho),
\]

where the penultimate equality is due to the unitary invariance of the entropy. □
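A single-qubit instance of Corollary 7.6 can be checked directly. The sketch below is illustrative only: it assumes a qubit state written as $\begin{pmatrix} a & c\\ \bar{c} & 1-a\end{pmatrix}$ and takes the mixed-unitary channel to be a Pauli $Z$ applied with probability one-half; the function names are hypothetical. The entropy uses the closed-form eigenvalues $(1 \pm r)/2$ with $r = \sqrt{(2a-1)^2 + 4|c|^2}$.

```python
import math

def qubit_entropy(a, c):
    """Von Neumann entropy (in bits) of the qubit state [[a, c], [conj(c), 1 - a]]."""
    r = math.sqrt((2 * a - 1) ** 2 + 4 * abs(c) ** 2)  # eigenvalues are (1 +/- r)/2
    entropy = 0.0
    for lam in ((1 + r) / 2, (1 - r) / 2):
        if lam > 1e-15:  # treat 0 * log 0 as 0
            entropy -= lam * math.log2(lam)
    return entropy

def z_with_probability_half(a, c):
    """(Z rho Z + rho)/2: Z flips the sign of c, so the average cancels it."""
    return a, 0.0

before = qubit_entropy(0.8, 0.3)
after = qubit_entropy(*z_with_probability_half(0.8, 0.3))
```

For this input the entropy after the channel is that of the diagonal state with entries 0.8 and 0.2, which is larger than the entropy before, as the corollary requires.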

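The next lemma shows that when the input is in the subspace $S_0^\perp$ the output of $\Phi'$
is very close to completely mixed. The distance measure used is the trace norm, but
this can be applied also to the case of the maximum output p-norm due to the fact that
$\|\rho\|_{\mathrm{tr}} = \|\rho\|_1 \ge \|\rho\|_p$ for $p \in [1, \infty]$. This is the key lemma in the proof of the results
on the additivity and multiplicativity conjectures, though it is not difficult to prove.

Lemma 7.7. On input states $\rho \in S_0^\perp$ the output of the channel $\Phi'$ given in Equation (7.12)
satisfies

\[
\| \Phi'(\rho) - \tilde{\mathbb{1}}_{\mathcal{K} \otimes \mathcal{B}} \|_{\mathrm{tr}} \le \frac{2}{\dim\mathcal{A}}.
\]

Proof. On input $\rho \in S_0^\perp$ the operation $D_{S_0}$ that introduces decoherence between the
subspaces $S_0$ and $S_0^\perp$ has no effect. This implies that the output of $M \circ D_{S_0}$ on $\rho$ is
obtained by setting $q = 0$ in Equation (7.7), which is

\[
(M \circ D_{S_0})(\rho) = \frac{\mathbb{1}_{\mathcal{A}} - |0\rangle\langle 0|}{\dim\mathcal{A} - 1} \otimes \tilde{\mathbb{1}}_{\mathcal{H}}. \tag{7.15}
\]

Setting $d = \dim\mathcal{A}$, the distance from the completely mixed state on $\mathcal{A} \otimes \mathcal{H}$ is

\[
\Big\| \frac{\mathbb{1}_{\mathcal{A}} - |0\rangle\langle 0|}{d-1} \otimes \tilde{\mathbb{1}}_{\mathcal{H}} - \tilde{\mathbb{1}}_{\mathcal{A}} \otimes \tilde{\mathbb{1}}_{\mathcal{H}} \Big\|_{\mathrm{tr}}
 = \Big\| \frac{\mathbb{1}_{\mathcal{A}} - d|0\rangle\langle 0|}{d(d-1)} \Big\|_{\mathrm{tr}}
 = \frac{d-1}{d(d-1)} + \frac{d-1}{d(d-1)} = \frac{2}{d}. \tag{7.16}
\]

Finally, by noting that the remainder of the transformation $\Phi'$ is mixed-unitary and
(implicitly) using the isomorphism $\mathcal{A} \otimes \mathcal{H} \cong \mathcal{K} \otimes \mathcal{B}$, an application of Lemma 7.5
yields the desired bound. □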
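The value $2/d$ in Equation (7.16) can be confirmed with exact rational arithmetic. This is a small sketch under the assumption that only the factor on $\mathcal{A}$ matters (the factor $\tilde{\mathbb{1}}_{\mathcal{H}}$ has trace norm one and is common to both states); the function name is illustrative.

```python
from fractions import Fraction

def distance_from_mixed(d):
    """Trace norm of (1 - |0><0|)/(d - 1) - 1/d on a d-dimensional space.

    Both operators are diagonal in the computational basis, so the trace
    norm is the sum of the absolute differences of the diagonal entries.
    """
    simulated = [Fraction(0)] + [Fraction(1, d - 1)] * (d - 1)
    mixed = [Fraction(1, d)] * d
    return sum(abs(s - m) for s, m in zip(simulated, mixed))
```

The $|0\rangle$ entry contributes $1/d$ and the remaining $d-1$ entries contribute $1/(d(d-1))$ each, which again sums to $2/d$.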

Once again we can extend this result to the case of the entropy. The previous
lemma on the trace norm can be extended to the case of the entropy in the standard
way: using Fannes' inequality [Fan73] (see also [NC00]), but given the characterization
in Equation (7.15), a better bound can be obtained by explicitly computing the entropy.
This bound will require that $\dim\mathcal{A} \ge 2$, but this can be assumed without loss of
generality by adding an unused ancillary space.

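Corollary 7.8. Let $\Phi'$ be given as in Equation (7.12), and let $\dim\mathcal{A} \ge 2$ and $\rho \in S_0^\perp$, then

\[
S(\Phi'(\rho)) \ge \log\dim\mathcal{H} + \log\dim\mathcal{A} - \frac{1}{\dim\mathcal{A}}.
\]

Proof. In the proof of Lemma 7.7 the output of $M \circ D_{S_0}$ on $\rho$ is given by Equation (7.15),
which states that

\[
(M \circ D_{S_0})(\rho) = \frac{\mathbb{1}_{\mathcal{A}} - |0\rangle\langle 0|}{\dim\mathcal{A} - 1} \otimes \tilde{\mathbb{1}}_{\mathcal{H}}.
\]

Letting $d = \dim\mathcal{A}$, this state has $(d-1)(\dim\mathcal{H})$ nonzero eigenvalues, each with value
equal to $1/((d-1)(\dim\mathcal{H}))$. Using this observation, the entropy of this state can be
computed as

\[
\begin{aligned}
S\Big(\frac{\mathbb{1}_{\mathcal{A}} - |0\rangle\langle 0|}{\dim\mathcal{A} - 1} \otimes \tilde{\mathbb{1}}_{\mathcal{H}}\Big)
 &= \log\big((d-1)(\dim\mathcal{H})\big)\\
 &= \log\dim\mathcal{H} + \log\dim\mathcal{A} + \log\frac{d-1}{d}\\
 &= S(\tilde{\mathbb{1}}_{\mathcal{A} \otimes \mathcal{H}}) - \log\Big(1 + \frac{1}{d-1}\Big).
\end{aligned} \tag{7.17}
\]

For $d \ge 2$, the last term has Taylor expansion given by

\[
\log\Big(1 + \frac{1}{d-1}\Big) = \frac{1}{\log_e 2}\Big[\frac{1}{d-1} - \frac{1}{2(d-1)^2} + \frac{1}{3(d-1)^3} - \cdots\Big] \le \frac{1}{(d-1)\log_e 2}.
\]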

Combining this with Equation (7.17) gives the lower bound on the entropy provided
in the statement of the corollary, for the state after the ancilla simulation procedure. By
Corollary 7.6, applying the remainder of $\Phi'$ to this state cannot decrease the entropy,
as this portion of $\Phi'$ is mixed-unitary. □

This corollary and Lemma 7.7 show that when the input state to $\Phi'$ does not have
large overlap with a state in $S_0$, the output state is highly mixed. This property will be
used to show that any state that maximizes the output norm or minimizes the output
entropy will have large overlap with the space $S_0$ of states of the form $|0\rangle\langle 0| \otimes \sigma$, where
the simulation of the original channel is faithful.



7.5 Multiplicativity of mixed-unitary transformations 



In this section the construction of Section 7.3 is used to show that the maximum output
p-norm of a channel is multiplicative if and only if the mixed-unitary approximations
to it are also multiplicative. This will be done for all $1 \le p < \infty$, using the analysis of
the previous section.

This property is not difficult to show once it has been established that the mixed-unitary
channel $\Phi'$ constructed from $\Phi$ in Equation (7.12) is a good approximation
with respect to the p-norm. This is the content of the following theorem.

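Theorem 7.9. If $\Phi \in \mathrm{T}(\mathcal{H}, \mathcal{K})$, then the mixed-unitary $\Phi' \in \mathrm{T}(\mathcal{A} \otimes \mathcal{H}, \mathcal{K} \otimes \mathcal{B})$ satisfies

\[
\nu_p(\Phi) \le \frac{\nu_p(\Phi')}{\|\tilde{\mathbb{1}}_{\mathcal{B}}\|_p} \le \nu_p(\Phi) + \frac{2(\dim\mathcal{B})^{1-1/p}}{\dim\mathcal{A}}.
\]

Proof. For convenience, let $d = \dim\mathcal{A}$. The first inequality is simple: Proposition 7.4
shows that

\[
\Phi'(|0\rangle\langle 0| \otimes \rho) = \Phi(\rho) \otimes \tilde{\mathbb{1}}_{\mathcal{B}},
\]

from which it follows immediately that

\[
\nu_p(\Phi') \ge \nu_p(\Phi)\, \|\tilde{\mathbb{1}}_{\mathcal{B}}\|_p,
\]

as $\nu_p$ is a maximization over input states, and $\|\cdot\|_p$ is multiplicative with respect to the
tensor product of two states.

To prove the second inequality let $\rho \in \mathrm{D}(\mathcal{A} \otimes \mathcal{H})$ be a state such that

\[
\nu_p(\Phi') = \|\Phi'(\rho)\|_p. \tag{7.18}
\]

Such a state exists by the compactness of $\mathrm{D}(\mathcal{A} \otimes \mathcal{H})$. The output of the channel $\Phi'$ on
$\rho$ is given by Equation (7.13); applying the triangle inequality to this yields

\[
\|\Phi'(\rho)\|_p = \| q\Phi(\sigma) \otimes \tilde{\mathbb{1}}_{\mathcal{B}} + (1-q)\Phi'(\rho_{S_0^\perp}) \|_p \le q\|\Phi(\sigma) \otimes \tilde{\mathbb{1}}_{\mathcal{B}}\|_p + (1-q)\|\Phi'(\rho_{S_0^\perp})\|_p.
\]

Lemma 7.7 provides a bound on the second term of this equation, which implies that

\[
\|\Phi'(\rho)\|_p \le q\|\Phi(\sigma) \otimes \tilde{\mathbb{1}}_{\mathcal{B}}\|_p + (1-q)\Big(\|\tilde{\mathbb{1}}_{\mathcal{K} \otimes \mathcal{B}}\|_p + \frac{2}{d}\Big).
\]

Then, as the norm $\|\cdot\|_p$ is multiplicative with respect to the tensor product of states,
and $\|\tilde{\mathbb{1}}_{\mathcal{K}}\|_p \le \|\xi\|_p$ for any state $\xi \in \mathrm{D}(\mathcal{K})$,

\[
\|\Phi'(\rho)\|_p \le q\|\Phi(\sigma)\|_p \|\tilde{\mathbb{1}}_{\mathcal{B}}\|_p + (1-q)\Big(\|\tilde{\mathbb{1}}_{\mathcal{K}}\|_p \|\tilde{\mathbb{1}}_{\mathcal{B}}\|_p + \frac{2}{d}\Big) \le \|\Phi(\sigma)\|_p \|\tilde{\mathbb{1}}_{\mathcal{B}}\|_p + \frac{2}{d}.
\]

Then, by the choice of the input $\rho$ in Equation (7.18), we have shown that

\[
\nu_p(\Phi') \le \nu_p(\Phi)\, \|\tilde{\mathbb{1}}_{\mathcal{B}}\|_p + \frac{2}{d}. \tag{7.19}
\]

Finally, the state $\tilde{\mathbb{1}}_{\mathcal{B}}$ has $\dim\mathcal{B}$ eigenvalues, each with value $1/\dim\mathcal{B}$, which implies
that

\[
\|\tilde{\mathbb{1}}_{\mathcal{B}}\|_p = \Big(\sum_{i=1}^{\dim\mathcal{B}} \frac{1}{(\dim\mathcal{B})^p}\Big)^{1/p} = (\dim\mathcal{B})^{1/p - 1}.
\]

Combining this with Equation (7.19), and expanding $d = \dim\mathcal{A}$, implies

\[
\frac{\nu_p(\Phi')}{\|\tilde{\mathbb{1}}_{\mathcal{B}}\|_p} \le \nu_p(\Phi) + \frac{2}{\|\tilde{\mathbb{1}}_{\mathcal{B}}\|_p \dim\mathcal{A}} = \nu_p(\Phi) + \frac{2(\dim\mathcal{B})^{1-1/p}}{\dim\mathcal{A}},
\]

which completes the proof of the second inequality. □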
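The closed form for $\|\tilde{\mathbb{1}}_{\mathcal{B}}\|_p$ used in the last step is easy to check numerically. The sketch below is illustrative and assumes finite $p \ge 1$; the function names are hypothetical, and the Schatten p-norm is computed from a list of eigenvalues.

```python
def schatten_p_norm(eigenvalues, p):
    """Schatten p-norm of a Hermitian operator given its eigenvalues (finite p >= 1)."""
    return sum(abs(x) ** p for x in eigenvalues) ** (1.0 / p)

def mixed_state_norm(dim, p):
    """p-norm of the completely mixed state on a dim-dimensional space."""
    return schatten_p_norm([1.0 / dim] * dim, p)

def error_term(dim_a, dim_b, p):
    """The additive error 2 / (||1~_B||_p * dim A) from the proof of Theorem 7.9."""
    return 2.0 / (mixed_state_norm(dim_b, p) * dim_a)
```

For example, with $\dim\mathcal{B} = 8$ and $p = 2$ the norm is $8^{-1/2}$, and `error_term(d, 8, 2)` agrees with $2 \cdot 8^{1/2}/d$.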



With this approximation result, the main theorem on the maximum output p-norm
can be shown. This extends the construction of Section 7.2 due to Fukuda [Fuk07] on
unital channels to the mixed-unitary case, using essentially the same method
of proof.

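Theorem 7.10. If $\Phi, \Psi \in \mathrm{T}(\mathcal{H}, \mathcal{K})$ and $p \in [1, \infty]$, then

\[
\nu_p(\Phi \otimes \Psi) = \nu_p(\Phi)\,\nu_p(\Psi)
\]

if

\[
\nu_p(\Phi'_d \otimes \Psi) = \nu_p(\Phi'_d)\,\nu_p(\Psi)
\]

for all sufficiently large $d$, where $\Phi'_d$ is the mixed-unitary approximation of the channel $\Phi$
obtained by applying the construction of Section 7.3 to a Stinespring dilation of $\Phi$ using a
$d$-dimensional ancillary space.

Proof. As adding ancillary space to $\Phi'$ increases both $\dim\mathcal{A}$ and $\dim\mathcal{B}$, by taking
$d = \dim\mathcal{A}$ large enough it can be assumed that $\dim\mathcal{B} \le 2d$. Let $\varepsilon > 0$, and choose $d$
so that $4/d^{1/p} < \varepsilon$. This, along with the choice of $\dim\mathcal{B} \le 2d$, implies that

\[
\frac{2(\dim\mathcal{B})^{1-1/p}}{d} \le \frac{4}{d^{1/p}} < \varepsilon. \tag{7.20}
\]

Then, as $\Phi'_d(|0\rangle\langle 0| \otimes \rho) = \Phi(\rho) \otimes \tilde{\mathbb{1}}_{\mathcal{B}}$ by Proposition 7.4,

\[
\nu_p(\Phi \otimes \Psi) \le \frac{\nu_p(\Phi'_d \otimes \Psi)}{\|\tilde{\mathbb{1}}_{\mathcal{B}}\|_p}.
\]

By assumption, this second quantity is multiplicative, so that

\[
\nu_p(\Phi \otimes \Psi) \le \frac{\nu_p(\Phi'_d)\,\nu_p(\Psi)}{\|\tilde{\mathbb{1}}_{\mathcal{B}}\|_p}.
\]

Applying Theorem 7.9 to this quantity, together with $\nu_p(\Psi) \le 1$, shows that

\[
\nu_p(\Phi \otimes \Psi) \le \Big(\nu_p(\Phi) + \frac{2(\dim\mathcal{B})^{1-1/p}}{d}\Big)\nu_p(\Psi) \le \nu_p(\Phi)\,\nu_p(\Psi) + \frac{4}{d^{1/p}} < \nu_p(\Phi)\,\nu_p(\Psi) + \varepsilon,
\]

where the final inequality is by the choice of $d$ to satisfy Equation (7.20). As $\varepsilon$
was chosen arbitrarily, the multiplicativity of $\nu_p(\Phi'_d)$ for all large enough $d$ implies
the multiplicativity of $\nu_p(\Phi)$. □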

This theorem shows that in order to show the multiplicativity of $\nu_p$ on a class of
channels it suffices to consider a related class of mixed-unitary channels. This problem
may be more tractable for channels of this type: many of the known counterexamples
to multiplicativity for small values of $p$ are mixed-unitary [HW08].



7.6 Mixed-unitaries and minimum output entropy 

The results of the previous section on the multiplicativity of the maximum output 
p-norm can be extended directly to the additivity of the minimum output entropy. 
This is done using very similar proof techniques as in the previous section. 

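The following theorem demonstrates that the mixed-unitary $\Phi'$ constructed in
Equation (7.12) forms a good approximation of the original channel $\Phi$, from which the
result on the additivity will follow directly.

Theorem 7.11. If $\Phi \in \mathrm{T}(\mathcal{H}, \mathcal{K})$ then the mixed-unitary $\Phi' \in \mathrm{T}(\mathcal{A} \otimes \mathcal{H}, \mathcal{K} \otimes \mathcal{B})$ satisfies

\[
S_{\min}(\Phi) \ge S_{\min}(\Phi') - \log\dim\mathcal{B} \ge S_{\min}(\Phi) - \frac{1}{\dim\mathcal{A}}.
\]

Proof. Exactly as in the case of Theorem 7.9, Proposition 7.4 implies the first inequality,
as $\Phi'$ on a particular state can be used to simulate $\Phi$:

\[
\Phi'(|0\rangle\langle 0| \otimes \rho) = \Phi(\rho) \otimes \tilde{\mathbb{1}}_{\mathcal{B}}.
\]

Let $\rho$ be a state minimizing $S(\Phi'(\rho))$ and for convenience let $\delta = 1/\dim\mathcal{A}$.
Equation (7.13) gives the output of $\Phi'$ on $\rho$. Applying the concavity of the entropy
(Proposition 3.3) to this, we obtain

\[
S_{\min}(\Phi') = S(\Phi'(\rho)) \ge qS(\Phi(\sigma) \otimes \tilde{\mathbb{1}}_{\mathcal{B}}) + (1-q)S(\Phi'(\rho_{S_0^\perp})).
\]

Applying Corollary 7.8 this becomes

\[
S_{\min}(\Phi') \ge qS(\Phi(\sigma) \otimes \tilde{\mathbb{1}}_{\mathcal{B}}) + (1-q)\big(S(\tilde{\mathbb{1}}_{\mathcal{A} \otimes \mathcal{H}}) - \delta\big).
\]

Notice that, since $\Phi'$ is mixed-unitary, $\mathcal{A} \otimes \mathcal{H}$ is isomorphic to $\mathcal{K} \otimes \mathcal{B}$. This implies
that $S(\tilde{\mathbb{1}}_{\mathcal{A} \otimes \mathcal{H}}) = S(\tilde{\mathbb{1}}_{\mathcal{K} \otimes \mathcal{B}})$.

Two additional properties of the entropy introduced in Section 3.1 will be useful:
the additivity of the entropy on states (Equation (3.2)), $S(\sigma \otimes \xi) = S(\sigma) + S(\xi)$, for
any $\sigma, \xi$; and the fact that the entropy is maximized on completely mixed states,
$S(\xi) \le \log\dim\mathcal{K} = S(\tilde{\mathbb{1}}_{\mathcal{K}})$ for all $\xi \in \mathrm{D}(\mathcal{K})$ (see Section 3.1). Using these three
observations, in order, we find that

\[
\begin{aligned}
S_{\min}(\Phi') &\ge qS(\Phi(\sigma) \otimes \tilde{\mathbb{1}}_{\mathcal{B}}) + (1-q)\big(S(\tilde{\mathbb{1}}_{\mathcal{K} \otimes \mathcal{B}}) - \delta\big)\\
 &= q\big(S(\Phi(\sigma)) + S(\tilde{\mathbb{1}}_{\mathcal{B}})\big) + (1-q)\big(S(\tilde{\mathbb{1}}_{\mathcal{K}}) + S(\tilde{\mathbb{1}}_{\mathcal{B}}) - \delta\big)\\
 &\ge q\big(S(\Phi(\sigma)) + S(\tilde{\mathbb{1}}_{\mathcal{B}})\big) + (1-q)\big(S(\Phi(\sigma)) + S(\tilde{\mathbb{1}}_{\mathcal{B}}) - \delta\big)\\
 &\ge S(\Phi(\sigma)) + S(\tilde{\mathbb{1}}_{\mathcal{B}}) - \delta.
\end{aligned}
\]

Finally, since $S(\tilde{\mathbb{1}}_{\mathcal{B}}) = \log\dim\mathcal{B}$ and $S_{\min}(\Phi) \le S(\Phi(\xi))$ for any $\xi$, we have

\[
S_{\min}(\Phi') \ge S_{\min}(\Phi) + \log\dim\mathcal{B} - \delta,
\]

which completes the proof of the theorem. □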

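The proof that the additivity of the minimum output entropy can be equivalently
restricted to mixed-unitary channels follows from the previous theorem in a way that
is identical to the proof of Theorem 7.10, with the exception that the p-norm has been
replaced by the minimum output entropy. The method of proof here follows Fukuda's
result for unital channels [Fuk07].

Theorem 7.12. If $\Phi, \Psi \in \mathrm{T}(\mathcal{H}, \mathcal{K})$, then

\[
S_{\min}(\Phi \otimes \Psi) = S_{\min}(\Phi) + S_{\min}(\Psi)
\]

if

\[
S_{\min}(\Phi'_d \otimes \Psi) = S_{\min}(\Phi'_d) + S_{\min}(\Psi)
\]

for all sufficiently large $d$, where $\Phi'_d$ is the mixed-unitary extension of the channel $\Phi$ obtained by
applying the construction of Section 7.3 to a Stinespring dilation for $\Phi$ using an ancillary space
of dimension $d$.

Proof. Let $\varepsilon > 0$, and choose $d$ large enough so that $1/d < \varepsilon$. Then, as

\[
\Phi'_d(|0\rangle\langle 0| \otimes \rho) = \Phi(\rho) \otimes \tilde{\mathbb{1}}_{\mathcal{B}},
\]

it follows that

\[
S_{\min}(\Phi \otimes \Psi) \ge S_{\min}(\Phi'_d \otimes \Psi) - \log\dim\mathcal{B}.
\]

By assumption, this second quantity is additive, so that

\[
\begin{aligned}
S_{\min}(\Phi \otimes \Psi) &\ge S_{\min}(\Phi'_d \otimes \Psi) - \log\dim\mathcal{B}\\
 &= S_{\min}(\Phi'_d) + S_{\min}(\Psi) - \log\dim\mathcal{B}\\
 &\ge S_{\min}(\Phi) + S_{\min}(\Psi) - \frac{1}{d}\\
 &> S_{\min}(\Phi) + S_{\min}(\Psi) - \varepsilon,
\end{aligned}
\]

where the penultimate inequality is an application of Theorem 7.11. As $\varepsilon$ was chosen
arbitrarily, the additivity of $\Phi'_d$ for all large enough $d$ implies the additivity of $\Phi$. □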

This theorem implies that in order to prove the additivity of the minimum output
entropy for a class of channels, the hopefully simpler class of mixed-unitary
approximations can instead be considered. This may be a fruitful approach: the only channels
for which $S_{\min}$ is known not to be additive are mixed-unitary [Has09] - this property
may be simpler to check for mixed-unitaries having certain properties.



7.7 Circuit constructions 

In this section an efficient circuit construction is provided for the mixed-unitary
approximation described in Section 7.3. This construction is used to extend the hardness
of computationally distinguishing quantum circuits to the case of mixed-unitary circuits.

Before constructing these circuits, it will be important to specify the circuit models
that are being used. The circuit model used to define the quantum circuit
distinguishability problem is the mixed state quantum circuit model of Aharonov, Kitaev, and
Nisan [AKN98], which is described in Section 2.1. As previously discussed, we may
assume that circuits in this model first introduce any necessary ancillary qubits, then
perform a unitary operation, and finally trace out those qubits that are not part of the
output. This approach is equivalent to building a circuit for the Stinespring dilation of
a channel. As all unitary transformations can be (approximately) implemented using
one and two qubit gates there is no loss in generality in assuming that the unitary
transformations implemented in such a circuit are composed of gates from some finite
basis of one and two qubit gates.

The second model of quantum circuits we consider is the model of mixed-unitary
quantum circuits. These circuits consist of one and two qubit gates from the usual circuit
model as well as mixed-unitary gates, which implement a unitary gate with probability
one half. More formally, the application of such a gate performs the operation

\[
\frac{1}{2}U\rho U^* + \frac{1}{2}\rho,
\]

where $U$ is a one or two qubit unitary gate in the standard gate set.

For technical reasons, we need to assume that the Pauli $X$ and $Z$ gates, as well as
controlled versions of these gates, are part of the standard basis. This restriction can
be avoided by allowing gates that implement

\[
\frac{1}{2}U_k \cdots U_2 U_1\, \rho\, U_1^* U_2^* \cdots U_k^* + \frac{1}{2}\rho,
\]

where the $U_i$ are gates of the standard model. This allows sequences of multiple gates,
such as approximations to gates not in the basis, to be applied with probability one
half. When proving a hardness result, the model should be as restricted as possible,
and it is not clear that this model is not more powerful than the model where each
mixed-unitary gate is applied with an independent probability of one half.

The model of mixed-unitary circuits is an extremely simple model that does not
appear to be universal for the class of transformations that implement mixed-unitary
operations. It is not clear that this is the correct definition of the mixed-unitary circuit
model, but since the aim of the model is to prove a hardness result, an extremely weak
definition has been chosen so that the result will apply to as large a class of circuit
models as possible.

One drawback of this weak model is that the exact construction used in Section 7.3
cannot be implemented. Specifically, the operation $D$ that decoheres the subspaces $S_0$
and $S_0^\perp$ seems to require a unitary operation that cannot be decomposed into a series of
one and two qubit gates, applied with probability one half. A similar situation occurs
for the implementation of the completely depolarizing channel on the subspace $S_0^\perp$,
the implementation of which uses the discrete Weyl operators on the subspace $S_0^\perp$.
These operations can be implemented in a mixed-unitary way in a more permissive
circuit model, but in order to keep the circuit model as simple as possible, a modified
construction is presented here. This modified construction is built from pieces that
perform similar tasks to those used in Section 7.3, but the specific building blocks
are not exactly the same. The construction in this section can also be applied to the
additivity and multiplicativity problems considered in Sections 7.5 and 7.6, but it is
somewhat more complicated than the construction already presented.

In order to approximate a given circuit with a mixed-unitary circuit we once again
make use of three main components, which are once again referred to as $N$, $D$, and
$M$. These pieces are labelled in this way due to the fact that they play the same roles
as the components of the same names used in Section 7.3, though the details differ
slightly. The first two of these components, $N$ the completely depolarizing channel
and $D$ the completely dephasing channel, are easy to implement as mixed-unitary
operations in the chosen circuit model. More difficult to implement is the channel $M$,
which performs a function similar to the channel described by Equation (7.7).



The complete dephasing channel $D$ is the channel that sets to zero all of the
off-diagonal elements of a density matrix. More formally, the action of this operator
applied to the space $\mathcal{A}$, for an input $\rho$ on $\mathcal{A} \otimes \mathcal{H}$, is given by

\[
D_{\mathcal{A}}(\rho) = \sum_{i=0}^{\dim\mathcal{A} - 1} p_i |i\rangle\langle i| \otimes \rho_i, \tag{7.21}
\]

where the $p_i$ form a probability distribution. This operation is equivalent to measuring
the space $\mathcal{A}$ in the computational basis and forgetting the result. This channel is
shown to be mixed-unitary in Proposition 7.3, where it is implemented as a mixture
of generalized Pauli $Z$ operations. To implement this as a mixed-unitary circuit,
observe that restricting the construction of Proposition 7.3 to the case where $\mathcal{A}$ is a
two dimensional space results in exactly the channel that applies a Pauli $Z$ gate with
probability one-half. Notice also that applying this channel to each of $n$ qubits is
identical to applying the completely dephasing channel to the whole space. Thus,
the operation $D_{\mathcal{A}}$ that applies $D$ to the space $\mathcal{A}$ can be implemented as a
mixed-unitary circuit by applying the Pauli $Z$ operation to each qubit of $\mathcal{A}$ independently
with probability 1/2. This construction can be found in [CY97].

The completely noisy channel $N$ is also simple to implement as a mixed-unitary
circuit. This channel can be realized on a single qubit by performing a uniform
mixture of the Pauli operators on that qubit, which is a consequence of Proposition 7.2
when restricted to the case of a single qubit. This mixture can be implemented by,
independently on each qubit, applying the Pauli $Z$ operation with probability 1/2,
followed by applying the Pauli $X$ operation with probability 1/2, as shown in [BR03].
Intuitively, the $Z$ operations will zero the off-diagonal elements of a density matrix
(viewed in the computational basis), and the $X$ operations will scramble the diagonal,
resulting in the completely mixed state, $\mathbb{1}/2$, on each qubit. As the tensor product
of two completely mixed qubits is the completely mixed state of the larger system,
applying this construction to each qubit in a space $\mathcal{B}$ will implement the completely
depolarizing channel on that space.
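The qubit-by-qubit implementations of $D$ and $N$ can be simulated directly on small density matrices. The following is a minimal sketch rather than a definitive implementation: it assumes that qubit $k$ corresponds to bit $k$ of the basis-state index, that density matrices are nested lists, and that a mixed-unitary gate is supplied as a function computing $U\rho U^*$. All names are illustrative.

```python
def mixed_gate(rho, conjugate):
    """Return (U rho U* + rho)/2, where conjugate(rho) computes U rho U*."""
    after = conjugate(rho)
    n = len(rho)
    return [[(after[i][j] + rho[i][j]) / 2 for j in range(n)] for i in range(n)]

def pauli_z(k):
    """Conjugation by Z on qubit k: entry (i, j) picks up (-1)^(i_k + j_k)."""
    def conjugate(rho):
        n = len(rho)
        return [[rho[i][j] * (-1) ** (((i >> k) & 1) + ((j >> k) & 1))
                 for j in range(n)] for i in range(n)]
    return conjugate

def pauli_x(k):
    """Conjugation by X on qubit k: bit k of the row and column index is flipped."""
    def conjugate(rho):
        n = len(rho)
        return [[rho[i ^ (1 << k)][j ^ (1 << k)] for j in range(n)] for i in range(n)]
    return conjugate

def dephase(rho, qubits):
    """Z with probability 1/2, independently on each listed qubit."""
    for k in qubits:
        rho = mixed_gate(rho, pauli_z(k))
    return rho

def depolarize(rho, qubits):
    """Z with probability 1/2, then X with probability 1/2, on each listed qubit."""
    for k in qubits:
        rho = mixed_gate(rho, pauli_z(k))
        rho = mixed_gate(rho, pauli_x(k))
    return rho

plus_plus = [[0.25] * 4 for _ in range(4)]   # two qubits, each in the |+> state
dephased = dephase(plus_plus, [0, 1])        # diagonal with entries 1/4
noisy = depolarize(plus_plus, [0, 1])        # completely mixed state on two qubits
```

On this input both procedures return the completely mixed state, since the diagonal of the input is already uniform; applying `depolarize` to a basis state such as $|00\rangle\langle 00|$ shows the additional scrambling performed by the $X$ gates.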



In Section 7.3 the channel $M$ was implemented as a completely depolarizing channel
on the subspace $S_0^\perp$ of inputs not in the state $|0\rangle$ on the 'ancillary space' $\mathcal{A}$. While the
same channel suffices for the circuit case, it is not at all obvious how this channel can
be implemented using only two-qubit mixed-unitary gates. This difficulty is avoided
by implementing a closely related channel. This construction is intuitively the same:
it does not affect states in the subspace $S_0$ of inputs with the $|0\rangle$ state in the space $\mathcal{A}$,
and it applies depolarizing noise to states in the space $S_0^\perp$. The difference is exactly
how this noise is applied. The circuit that is constructed implements the operation $M$
defined by

\[
M(|i\rangle\langle i| \otimes \rho) =
\begin{cases}
\dfrac{1}{\dim\mathcal{A}}\big(\mathbb{1}_{\mathcal{A}} - |0\rangle\langle 0| + |\psi_i\rangle\langle\psi_i|\big) \otimes \tilde{\mathbb{1}}_{\mathcal{H}} & \text{if } i \ne 0,\\[2mm]
|0\rangle\langle 0| \otimes \rho & \text{if } i = 0,
\end{cases} \tag{7.22}
\]

where $|\psi_i\rangle$ is a nonzero computational basis state that depends on $i$. The exact
specification of this state can be extracted from the analysis of the circuit constructed for $M$,
but this is not helpful.

It is perhaps not a surprise that the transformation $M$ can be implemented using
only controlled-mixing operations. Before describing this implementation, notice that
the controlled application of the completely depolarizing channel $N$ to a single qubit
can be described by a mixed-unitary circuit. This is because the previously discussed
implementation of $N$ is given by a mixture of the single qubit gates $X$ and $Z$. Adding a
control qubit to each of these gates results in two qubit gates, which fit into the model
of mixed-unitary circuits used here (because we have assumed that $X$ and $Z$, as well as
controlled versions of them, are included in the standard basis of gates - dropping this
assumption requires the circuit model to be generalized slightly). It is not clear that
general controlled mixed-unitary operations can be implemented as mixed-unitary
circuits in this model, but the only controlled operation that will be needed for this
construction is the completely depolarizing channel.

Figure 7.2: One stage of the mixing procedure on the ancillary qubits. The mixing
operations applied to the qubits in the space $\mathcal{H}$ are not shown.

At an intuitive level, the implementation of the channel $M$ consists of the
application of controlled-depolarizing operations everywhere that this is possible. These
operations will all be controlled by the qubits in the space $\mathcal{A}$, which ensures that in the
case of a state in $S_0$, with $|0\rangle$ in the space $\mathcal{A}$, the operation $M$ acts trivially.

More formally, let $m$ be the number of qubits in the space $\mathcal{A}$ that are given as part of
the input to $M$, i.e. the number of ancillary qubits used to represent the ancillary space
used by the original channel. The implementation of $M$ consists of $m$ stages, with the
$j$th stage testing that the $j$th qubit of the space is in the $|0\rangle$ state, and mixing the qubits
if this is not the case. An example of one stage of the circuit is given in Figure 7.2. The
$j$th stage consists first of an application of the controlled $N$ operation from the $j$th qubit
to each other qubit of $\mathcal{A} \otimes \mathcal{H}$. After these operations, stage $j$ is completed by $m - 1$
further controlled $N$ operations: each with the $j$th qubit as the target qubit and one of
the other qubits of $\mathcal{A}$ as the control qubit. An example of this construction with $m = 3$
is presented in Figure 7.3.
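The staged procedure just described can be simulated on a few qubits to check two of its qualitative properties: inputs with $|0\rangle$ on $\mathcal{A}$ are left alone, and computational basis inputs with a nonzero state on $\mathcal{A}$ end with no weight on $|0\rangle$ and with the qubits of $\mathcal{H}$ completely mixed. The sketch below is one reading of the text, not the circuit of the figures: it assumes the controlled $N$ operation is a controlled-$Z$ applied with probability one-half followed by a controlled-$X$ applied with probability one-half, and that qubit $k$ is bit $k$ of the basis index. All names are hypothetical.

```python
def mixed_gate(rho, conjugate):
    """Return (U rho U* + rho)/2, where conjugate(rho) computes U rho U*."""
    after = conjugate(rho)
    n = len(rho)
    return [[(after[i][j] + rho[i][j]) / 2 for j in range(n)] for i in range(n)]

def controlled_z(c, t):
    """Conjugation by controlled-Z: phase -1 when both bit c and bit t are set."""
    def conjugate(rho):
        n = len(rho)
        s = [-1 if ((i >> c) & 1) and ((i >> t) & 1) else 1 for i in range(n)]
        return [[s[i] * s[j] * rho[i][j] for j in range(n)] for i in range(n)]
    return conjugate

def controlled_x(c, t):
    """Conjugation by controlled-X: bit t is flipped when bit c is set."""
    def conjugate(rho):
        n = len(rho)
        f = [i ^ (1 << t) if (i >> c) & 1 else i for i in range(n)]
        return [[rho[f[i]][f[j]] for j in range(n)] for i in range(n)]
    return conjugate

def controlled_n(rho, c, t):
    """Controlled depolarizing of qubit t, controlled by qubit c (assumed form)."""
    rho = mixed_gate(rho, controlled_z(c, t))
    return mixed_gate(rho, controlled_x(c, t))

def ancilla_mixing(rho, ancilla, data):
    """One stage per ancilla qubit j: controlled-N from j to every other qubit,
    then controlled-N onto j from each of the other ancilla qubits."""
    for j in ancilla:
        for t in ancilla + data:
            if t != j:
                rho = controlled_n(rho, j, t)
        for c in ancilla:
            if c != j:
                rho = controlled_n(rho, c, j)
    return rho

def basis_state(index, n_qubits):
    dim = 1 << n_qubits
    return [[1.0 if i == j == index else 0.0 for j in range(dim)] for i in range(dim)]

# Qubit 0 plays the role of H; qubits 1 and 2 play the role of A (m = 2).
out = ancilla_mixing(basis_state(0b010, 3), [1, 2], [0])
```

Here `out` is diagonal, its entries at indices 0 and 1 (ancilla in $|00\rangle$) are zero, and the diagonal weight is split evenly between the two values of the data qubit.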

Given these circuit implementations of the three channels Dji, N^, M, the mixed- 
unitary circuit C that approximates a given circuit Q is constructed in exactly the 
same was as in Equation ( 7.12| >. More concretely, let Q be a circuit implementing the 
operation 

Q(p)=tra3U(|0)(0|®p)U*, 

where the ancillary qubits are in the space A. The circuit C that approximates it is then 
given by 

C(p) = (U [(M o Dyi)(p)] U*) . (7.23) 
Figure 7.3: Circuit performing the ancilla simulation procedure $M \circ D_{\mathcal{A}}$. The top
three qubits simulate the ancillary qubits of the original circuit in the space $\mathcal{A}$, and the
bottom two simulate the input to the original circuit in the space $\mathcal{H}$. The dashed lines
separate each of the three stages of the mixing procedure.

Figure 7.4: The constructed mixed-unitary circuit $C$ that simulates the given circuit $Q$,
with input and output Hilbert spaces marked. The circuit $U$ is the unitary from the
implementation of $Q$ in Stinespring form. The circuits $D_{\mathcal{A}}$, $M$, and $N_{\mathcal{B}}$ are as described
in the text.



This construction of the circuit $C$ is shown in Figure 7.4. Notice that $C$ is constructed
to be a mixed-unitary circuit, as it is the composition of smaller mixed-unitary circuits.
Since the operations $D_{\mathcal{A}}$ and $M$ do not affect inputs of the form $|0\rangle\langle 0| \otimes \rho$ in the subspace
$S_0$, the proof of Proposition 7.4 holds also for the circuit case, so that we have

\[ C(|0\rangle\langle 0| \otimes \sigma) = Q(\sigma) \otimes \tilde{\mathbb{1}}_{\mathcal{B}}. \tag{7.24} \]
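Equation (7.24) can be checked numerically on a toy instance. The sketch below assumes the smallest nontrivial case: one ancillary qubit (so the mixing procedure is a single controlled-depolarizing gate), one input qubit, and a randomly chosen unitary `U` standing in for the Stinespring unitary of some circuit $Q$. The helper names (`random_unitary`, `conj_avg`, `C`, `Q`) and the realization of each channel as a uniform mixture of unitaries are assumptions made for illustration. In the code the traced-out qubit is the first tensor factor, so the completely mixed state appears on the left.

```python
import numpy as np

rng = np.random.default_rng(0)
I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
PAULIS = [I2, X, Y, Z]
P0 = np.array([[1, 0], [0, 0]], dtype=complex)  # |0><0|
P1 = np.array([[0, 0], [0, 1]], dtype=complex)  # |1><1|


def random_unitary(d):
    """A unitary from the QR decomposition of a complex Gaussian matrix."""
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    q, _ = np.linalg.qr(g)
    return q


U = random_unitary(4)  # hypothetical Stinespring unitary; qubit 0 is the ancilla


def Q(sigma):
    """Q(sigma) = tr_B U (|0><0| (x) sigma) U^*, tracing out qubit 0."""
    joint = U @ np.kron(P0, sigma) @ U.conj().T
    return np.trace(joint.reshape(2, 2, 2, 2), axis1=0, axis2=2)


def conj_avg(rho, unitaries):
    """Uniform mixture of conjugations: a mixed-unitary channel."""
    return sum(V @ rho @ V.conj().T for V in unitaries) / len(unitaries)


D_A = [np.kron(I2, I2), np.kron(Z, I2)]                 # dephase the ancilla
M = [np.kron(P0, I2) + np.kron(P1, P) for P in PAULIS]  # one mixing stage (m = 1)
N_B = [np.kron(P, I2) for P in PAULIS]                  # depolarize the traced-out qubit


def C(rho):
    """C = N_B o (U . U^*) o M o D_A, following Equation (7.23)."""
    rho = conj_avg(rho, D_A)
    rho = conj_avg(rho, M)
    rho = U @ rho @ U.conj().T
    return conj_avg(rho, N_B)


sigma = np.array([[0.75, 0.25], [0.25, 0.25]], dtype=complex)
lhs = C(np.kron(P0, sigma))
rhs = np.kron(I2 / 2, Q(sigma))
```

Here `lhs` and `rhs` agree up to numerical precision, and `C` also fixes the completely mixed state, as any mixed-unitary circuit must.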

Combining this with Equation (7.21) and the fact that applying $D_{\mathcal{A}}$ twice has no further
effect, the output of $C$ on an arbitrary input state $\rho$ is of the form

\[
C(\rho) = \sum_{i=0}^{\dim\mathcal{A}-1} p_i\, C(|i\rangle\langle i| \otimes \rho_i)
        = p_0\, Q(\rho_0) \otimes \tilde{\mathbb{1}}_{\mathcal{B}} + \sum_{i=1}^{\dim\mathcal{A}-1} p_i\, C(|i\rangle\langle i| \otimes \rho_i). \tag{7.25}
\]

In the remainder of the chapter it is shown that this construction does not significantly
alter the distinguishability properties of quantum circuits.
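The two facts about the decoherence operation used in this decomposition (it is idempotent, and its output is block diagonal with respect to the computational basis of the ancillary space) can be illustrated with a short sketch. It assumes two ancillary qubits and one input qubit, and assumes the decoherence operation is realized as a uniform mixture of all products of $Z$ gates on the ancillary qubits. The names `dephase_ancilla`, `blocks` and `p` are illustrative.

```python
import itertools

import numpy as np

rng = np.random.default_rng(1)
m, n = 2, 1  # assumed sizes: m ancillary qubits, n input qubits
I2 = np.eye(2, dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)


def kron_all(ops):
    """Tensor product of a list of operators, in order."""
    out = np.eye(1, dtype=complex)
    for op in ops:
        out = np.kron(out, op)
    return out


def dephase_ancilla(rho):
    """Uniform mixture over all Z-type strings on the m ancillary qubits."""
    out = np.zeros_like(rho)
    for bits in itertools.product([0, 1], repeat=m):
        V = kron_all([Z if b else I2 for b in bits] + [I2] * n)
        out += V @ rho @ V.conj().T
    return out / 2**m


# A random density matrix on the ancillary and input qubits.
d = 2 ** (m + n)
g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
rho = g @ g.conj().T
rho /= np.trace(rho).real

once = dephase_ancilla(rho)
twice = dephase_ancilla(once)

# View the output as a (2^m x 2^m) grid of blocks acting on the input qubits.
blocks = once.reshape(2**m, 2**n, 2**m, 2**n)
p = [np.trace(blocks[i, :, i, :]).real for i in range(2**m)]
```

The diagonal blocks are the unnormalized states $p_i \rho_i$, the off-diagonal blocks vanish, and `p` is a probability distribution.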
As a first step towards this, it is shown that the above circuit construction correctly
implements the channel $M$ described by Equation 7.22. Much of the proof of this
lemma is similar to the proof of Lemma 7.7, but the operation $M$ considered in this
section is slightly different and the proof must be extended to the case where there is
an additional reference system.

This system, given by the space $\mathcal{F}$, is needed in the case of distinguishability, as
a party attempting to distinguish two channels is permitted to use a portion of a
larger entangled state as input to the channels. This is modelled by the use of the
diamond norm in the definition of the computational problem QCD, from Section 5.2,
the hardness of which will be extended to the mixed-unitary case.

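Lemma 7.13. On input states of the form $|k\rangle\langle k| \otimes \rho \in D(\mathcal{A} \otimes \mathcal{H} \otimes \mathcal{F})$ for $|k\rangle\langle k| \in D(\mathcal{A})$
with $0 < k \le 2^m - 1$, the output of $C$ satisfies

\[ \left\| (C \otimes \mathbb{1}_{\mathcal{F}})(|k\rangle\langle k| \otimes \rho) - \tilde{\mathbb{1}}_{\mathcal{A} \otimes \mathcal{H}} \otimes \operatorname{tr}_{\mathcal{H}} \rho \right\|_{\mathrm{tr}} \le \frac{1}{2^{m-1}}, \]

where $m$ is the number of ancillary qubits used by the circuit $Q$.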

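Proof. On input of the form $|k\rangle\langle k| \otimes \rho$ the decoherence operations that are applied
to the qubits in $\mathcal{A}$ can be ignored, as they have no effect on qubits in a state of
the computational basis. As $k \neq 0$ at least one qubit is in the state $|1\rangle$, and so the
controlled-mixing operations in the implementation of the channel $M$ will have an
effect. Let the first nonzero qubit among the qubits of $\mathcal{A}$ be the $j$th one. The first
controlled $N$ operation with nonzero control qubit that affects the $j$th qubit will be at
the $j$th stage of the mixing process, where the $j$th qubit is the control qubit. As this
qubit is not modified before this stage (as any previous qubits are in the state $|0\rangle$ by
choice of $j$), the first $m - 1$ gates in the $j$th stage will mix the remaining qubits, so that
the state after these gates is, using Equation (7.5),

\[ |1\rangle\langle 1| \otimes \tilde{\mathbb{1}}_{\mathcal{A}'} \otimes \tilde{\mathbb{1}}_{\mathcal{H}} \otimes \operatorname{tr}_{\mathcal{H}} \rho, \]

where for notational convenience the $j$th qubit has been written first, and $\mathcal{A}'$ is the
space of all but the $j$th qubit of $\mathcal{A}$. The remainder of the $j$th stage of the mixing
process consists of $m - 1$ controlled $N$ gates with the $j$th qubit as the target, each
controlled by one of the $m - 1$ qubits in $\mathcal{A}'$. Considering the state $\mathbb{1}_{\mathcal{A}'}/2^{m-1}$ on $\mathcal{A}'$
in the computational basis, the only term for which qubit $j$ is not mixed by these
operations is the all zero term. With this observation, the state after the $j$th stage is

\begin{align*}
& \frac{1}{2^{m-1}} \left[ |1\rangle\langle 1| \otimes (|0\rangle\langle 0|)^{\otimes m-1} + \frac{|0\rangle\langle 0| + |1\rangle\langle 1|}{2} \otimes \left( \mathbb{1}_{\mathcal{A}'} - (|0\rangle\langle 0|)^{\otimes m-1} \right) \right] \otimes \tilde{\mathbb{1}}_{\mathcal{H}} \otimes \operatorname{tr}_{\mathcal{H}} \rho \\
&\quad = \left[ \tilde{\mathbb{1}}_{\mathcal{A}} + \frac{1}{2^{m}} \left( |1\rangle\langle 1| - |0\rangle\langle 0| \right) \otimes (|0\rangle\langle 0|)^{\otimes m-1} \right] \otimes \tilde{\mathbb{1}}_{\mathcal{H}} \otimes \operatorname{tr}_{\mathcal{H}} \rho.
\end{align*}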
This proves that the circuit implementing the channel $M$ does so correctly, as this
quantity is exactly the state given in Equation (7.25) with the addition of $\operatorname{tr}_{\mathcal{H}} \rho$ in the
reference system.

As in the proof of Lemma 7.7, let this state be $\sigma$. Computing the distance from this
state to the desired one, we have

\[ \left\| \sigma - \tilde{\mathbb{1}}_{\mathcal{A}} \otimes \tilde{\mathbb{1}}_{\mathcal{H}} \otimes \operatorname{tr}_{\mathcal{H}} \rho \right\|_{\mathrm{tr}} = \frac{1}{2^{m}} \left\| |1\rangle\langle 1| \otimes (|0\rangle\langle 0|)^{\otimes m-1} - (|0\rangle\langle 0|)^{\otimes m} \right\|_{\mathrm{tr}} = \frac{1}{2^{m-1}}. \]

Finally, by noting that the remainder of the circuit $C$ is mixed-unitary, Lemma 7.5
implies that the rest of the circuit cannot increase the norm. □
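The bound of Lemma 7.13 can be checked numerically for the mixing procedure on a small example. The sketch below assumes $m = 3$ ancillary qubits and one input qubit, omits the reference system, and realizes each controlled $N$ operation as a uniform mixture of controlled-Pauli unitaries (an assumed realization). Only the mixing procedure is simulated; by Lemma 7.5 the remaining mixed-unitary parts of $C$ cannot increase the distance. The names `embed`, `controlled_depolarize`, `mixing_procedure` and `basis_input` are illustrative.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
PAULIS = [I2, X, Y, Z]
P0 = np.array([[1, 0], [0, 0]], dtype=complex)  # |0><0|
P1 = np.array([[0, 0], [0, 1]], dtype=complex)  # |1><1|


def embed(ops, n_qubits):
    """Tensor product with ops[q] on qubit q and the identity elsewhere."""
    out = np.eye(1, dtype=complex)
    for q in range(n_qubits):
        out = np.kron(out, ops.get(q, I2))
    return out


def controlled_depolarize(rho, control, target, n_qubits):
    """Depolarize `target` when `control` is |1>, as a mixture of unitaries."""
    out = np.zeros_like(rho)
    for P in PAULIS:
        U = embed({control: P0}, n_qubits) + embed({control: P1, target: P}, n_qubits)
        out += U @ rho @ U.conj().T / 4
    return out


def mixing_procedure(rho, m, n):
    """The m stages described in the text; ancillary qubits are 0..m-1."""
    total = m + n
    for j in range(m):
        for t in range(total):      # qubit j controls every other qubit
            if t != j:
                rho = controlled_depolarize(rho, j, t, total)
        for c in range(m):          # every other ancilla controls qubit j
            if c != j:
                rho = controlled_depolarize(rho, c, j, total)
    return rho


m, n = 3, 1
sigma = np.array([[0.75, 0.25], [0.25, 0.25]], dtype=complex)
mixed = np.eye(2 ** (m + n)) / 2 ** (m + n)


def basis_input(k):
    """The state |k><k| (x) sigma with |k> on the ancillary qubits."""
    ket = np.zeros(2**m, dtype=complex)
    ket[k] = 1
    return np.kron(np.outer(ket, ket), sigma)


# Trace-norm distance from the completely mixed state for every k != 0.
distances = [
    np.linalg.norm(mixing_procedure(basis_input(k), m, n) - mixed, ord="nuc")
    for k in range(1, 2**m)
]
unchanged = mixing_procedure(basis_input(0), m, n)
```

Every entry of `distances` is at most $1/2^{m-1} = 1/4$, while the input with $k = 0$ passes through the mixing procedure unchanged.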

In the next section this lemma is used to show that the hardness of the computational 
problem of distinguishing mixed-state circuits does not change when restricted to the 
mixed-unitary circuits. 



7.8 QIP-completeness of distinguishing mixed-unitary
circuits

The construction outlined in the previous section can be used to find mixed-unitary
approximations to general quantum circuits, with the property that the diamond norm
of the difference of two such circuits is approximately preserved. This property leads
immediately to a proof that the problem of distinguishing mixed-unitary quantum
circuits is QIP-complete, i.e. exactly as hard as the problem of distinguishing
general quantum circuits. This will be done by taking an instance $(Q_1, Q_2)$ of the
general quantum circuit distinguishability problem (Problem 5.1), and constructing
the instance $(C_1, C_2)$ with $C_1$ and $C_2$ mixed-unitary, by applying the construction of
Section 7.7 to each of these circuits.

This technique produces an instance of the mixed-unitary quantum circuit
distinguishability problem, which is hereafter referred to as Mixed-Unitary QCD. This
problem is identical to QCD with the exception that the input circuits are required to
be mixed-unitary circuits, in the model defined in Section 7.7. This problem is more
formally defined as

Problem 7.14 (Mixed-unitary Quantum Circuit Distinguishability). For constants
$0 \le b < a \le 2$, the input consists of mixed-unitary quantum circuits $C_1$ and $C_2$ that
implement transformations in $T(\mathcal{H}, \mathcal{H})$. The promise problem is to distinguish the two
cases:

Yes: $\| C_1 - C_2 \|_\diamond \ge a$.
No: $\| C_1 - C_2 \|_\diamond \le b$.
As this problem is a restriction of the more general circuit distinguishability
problem, the protocol of Section 5.3 shows that it too is in QIP. The remainder of this section
is devoted to showing that Mixed-Unitary QCD is QIP-complete for all $0 < b < a \le 2$.

The most important step in the proof that this restricted distinguishability problem
is QIP-complete is to show that the construction in Section 7.7 does not significantly
alter the diamond norm of the difference of the two circuits. This is the content of the
following theorem.

Theorem 7.15. Let $Q_1$ and $Q_2$ be arbitrary circuits implementing transformations in $T(\mathcal{H}, \mathcal{K})$,
and let $C_i$ be the mixed-unitary circuit constructed from $Q_i$ as in Equation (7.23). For any
$\varepsilon > 0$,

\[ \| Q_1 - Q_2 \|_\diamond \le \| C_1 - C_2 \|_\diamond \le \| Q_1 - Q_2 \|_\diamond + \varepsilon, \]

where the circuits $C_1$ and $C_2$ use $O(\log 1/\varepsilon)$ extra qubits in the space $\mathcal{A}$.

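Proof. The first inequality is not hard to show. Once again, the idea is that sending the input
state $(|0\rangle\langle 0|)^{\otimes m} \otimes \rho$ to the circuit $C_i$ results in a simulation of $Q_i$, by Equation 7.24.
This will imply that the distinguishability of $Q_1$ and $Q_2$ cannot be greater than the
distinguishability of $C_1$ and $C_2$. To formalize this argument, note that by the definition
of the diamond norm

\[ \| Q_1 - Q_2 \|_\diamond = \sup_{\rho \in D(\mathcal{H} \otimes \mathcal{F})} \left\| (Q_1 \otimes \mathbb{1}_{\mathcal{F}})(\rho) - (Q_2 \otimes \mathbb{1}_{\mathcal{F}})(\rho) \right\|_{\mathrm{tr}}, \]

and fix $\delta > 0$ and $\rho$ as a state achieving a value within $\delta$ of this supremum. By
Equation 7.24, if the state $(|0\rangle\langle 0|)^{\otimes m} \otimes \rho$ is given as input to the circuit $C_i$, then the
output is $(Q_i \otimes \mathbb{1}_{\mathcal{F}})(\rho)$, tensored with a completely mixed state that does not
change the trace norm. Using this property we have

\begin{align*}
\| C_1 - C_2 \|_\diamond &\ge \left\| (C_1 \otimes \mathbb{1}_{\mathcal{F}})((|0\rangle\langle 0|)^{\otimes m} \otimes \rho) - (C_2 \otimes \mathbb{1}_{\mathcal{F}})((|0\rangle\langle 0|)^{\otimes m} \otimes \rho) \right\|_{\mathrm{tr}} \\
&= \left\| (Q_1 \otimes \mathbb{1}_{\mathcal{F}})(\rho) - (Q_2 \otimes \mathbb{1}_{\mathcal{F}})(\rho) \right\|_{\mathrm{tr}} \\
&\ge \| Q_1 - Q_2 \|_\diamond - \delta.
\end{align*}

Since this is true for any $\delta > 0$, it must be the case that $\| Q_1 - Q_2 \|_\diamond \le \| C_1 - C_2 \|_\diamond$.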

The second inequality requires somewhat more work. The idea is to once again
break the input space into two subspaces: the one on which the circuits $C_i$ simulate the
circuits $Q_i$, and the orthogonal subspace. We will then use Lemma 7.13 to show that
on this orthogonal subspace the output states of the circuits $C_i$ are almost completely
mixed. This will in turn imply that the diamond norm on this input subspace is
exponentially small in the number of ancillary qubits added in the construction of
the circuits $C_i$. An appeal to the decoherence operation applied as part of the circuit
construction will validate the approach of treating the input state as a mixture of states
from these two orthogonal subspaces.

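More formally, let $m$ be the number of ancillary qubits (the space $\mathcal{A}$) and let $n$ be
the number of input qubits (the space $\mathcal{H}$) used by the circuits $Q_i$. It can be assumed
that the circuits $Q_1$ and $Q_2$ both use the same number of ancillary qubits by padding
one of the circuits with unused qubits that are later traced out. The values of $n$ and $m$
may also be expressed as $m = \lceil \log \dim \mathcal{A} \rceil$ and $n = \lceil \log \dim \mathcal{H} \rceil$. By adding at most
$3 + \log(1/\varepsilon)$ extra ancillary qubits to the space $\mathcal{A}$, we may assume that

\[ 2^{-(m-3)} < \varepsilon. \tag{7.26} \]

Let $\rho \in D(\mathcal{A} \otimes \mathcal{H} \otimes \mathcal{F})$ be a state such that

\[ \| C_1 - C_2 \|_\diamond - \varepsilon/2 \le \left\| (C_1 \otimes \mathbb{1}_{\mathcal{F}})(\rho) - (C_2 \otimes \mathbb{1}_{\mathcal{F}})(\rho) \right\|_{\mathrm{tr}}, \tag{7.27} \]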



and note that the reference system $\mathcal{F}$ need not have the same dimension as the space
of the same name considered in the proof of the previous inequality. The first gates
applied in the circuit $C_i$ are the decoherence gates applied to $\mathcal{A}$. These gates produce a
state of the form $\sum_{i=0}^{\dim\mathcal{A}-1} p_i |i\rangle\langle i| \otimes \sigma_i$, for $\{p_i\}$ a probability distribution. Since applying
these decoherence operations twice has no further effect, the output of the circuits $C_1$
and $C_2$ is the same on $\rho$ as it is on this state. Applying this property and the triangle
inequality to Equation (7.27), the quantity of interest becomes

\[ \| C_1 - C_2 \|_\diamond - \varepsilon/2 \le \sum_{i=0}^{\dim\mathcal{A}-1} p_i \left\| (C_1 \otimes \mathbb{1}_{\mathcal{F}})(|i\rangle\langle i| \otimes \sigma_i) - (C_2 \otimes \mathbb{1}_{\mathcal{F}})(|i\rangle\langle i| \otimes \sigma_i) \right\|_{\mathrm{tr}}. \tag{7.28} \]



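Then, by applying Lemma 7.13 to each term with $i \neq 0$, the states in the norm can be
replaced with completely mixed states on $\mathcal{A} \otimes \mathcal{H}$ plus a small correction factor. Doing
this for each of these terms we have

\begin{align*}
& p_i \left\| (C_1 \otimes \mathbb{1}_{\mathcal{F}})(|i\rangle\langle i| \otimes \sigma_i) - (C_2 \otimes \mathbb{1}_{\mathcal{F}})(|i\rangle\langle i| \otimes \sigma_i) \right\|_{\mathrm{tr}} \\
&\quad \le p_i \left[ \frac{1}{2^{m-2}} + \left\| \tilde{\mathbb{1}}_{\mathcal{A} \otimes \mathcal{H}} \otimes \operatorname{tr}_{\mathcal{H}} \sigma_i - \tilde{\mathbb{1}}_{\mathcal{A} \otimes \mathcal{H}} \otimes \operatorname{tr}_{\mathcal{H}} \sigma_i \right\|_{\mathrm{tr}} \right].
\end{align*}

The remaining norm is zero, and $1/2^{m-2} < \varepsilon/2$ by Equation (7.26).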



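Applying this to Equation (7.28) results in

\[ \| C_1 - C_2 \|_\diamond - \varepsilon/2 \le p_0 \left\| (C_1 \otimes \mathbb{1}_{\mathcal{F}})(|0\rangle\langle 0| \otimes \sigma_0) - (C_2 \otimes \mathbb{1}_{\mathcal{F}})(|0\rangle\langle 0| \otimes \sigma_0) \right\|_{\mathrm{tr}} + \sum_{i=1}^{\dim\mathcal{A}-1} p_i\, \varepsilon/2. \]

By Equation 7.24 the output of the circuit $C_i$ on this input can be replaced by the output
of the circuit $Q_i$ and a maximally mixed state. When this is done to the previous
equation, the desired bound is given by

\begin{align*}
\| C_1 - C_2 \|_\diamond &\le p_0 \left\| (Q_1 \otimes \mathbb{1}_{\mathcal{F}})(\sigma_0) \otimes \tilde{\mathbb{1}}_{\mathcal{B}} - (Q_2 \otimes \mathbb{1}_{\mathcal{F}})(\sigma_0) \otimes \tilde{\mathbb{1}}_{\mathcal{B}} \right\|_{\mathrm{tr}} + (1 - p_0)\varepsilon/2 + \varepsilon/2 \\
&\le p_0 \left\| (Q_1 \otimes \mathbb{1}_{\mathcal{F}})(\sigma_0) - (Q_2 \otimes \mathbb{1}_{\mathcal{F}})(\sigma_0) \right\|_{\mathrm{tr}} + \varepsilon \\
&\le \| Q_1 - Q_2 \|_\diamond + \varepsilon.
\end{align*}

This completes the proof of the theorem, since $0 \le p_0 \le 1$. □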

The quantum circuit distinguishability problem is defined in terms of the diamond 
norm of the difference of two circuits. The bounds on this quantity provided by the 
previous theorem immediately imply the following corollary. 

Corollary 7.16. For any $0 < b < a \le 2$ the problem Mixed-Unitary QCD$_{a,b}$ is
QIP-complete.

Proof. Starting with an instance $(Q_1, Q_2)$ of QCD$_{a,b/2}$ and applying the construction
of Section 7.7 to each circuit results in a pair $(C_1, C_2)$. By Theorem 7.15, on positive
instances of QCD we have

\[ \| C_1 - C_2 \|_\diamond \ge \| Q_1 - Q_2 \|_\diamond \ge a, \]

and on negative instances we have

\[ \| C_1 - C_2 \|_\diamond \le \| Q_1 - Q_2 \|_\diamond + \varepsilon \le b/2 + \varepsilon. \]

By adding $O(\log 1/b)$ extra qubits in the construction of each of the circuits $C_i$, we
can make $\varepsilon < b/2$, so that the resulting pair $(C_1, C_2)$ is an instance of the problem
Mixed-Unitary QCD$_{a,b}$, provided that $b > 0$.

To see that the reduction can be implemented efficiently, notice that the circuit
construction can be done in polynomial time, since we have only added $O(\log 1/b)$
qubits, as well as a few operations that do not depend on the actual input circuits, just
the number of qubits they act on. □
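The parameter bookkeeping in Theorem 7.15 and Corollary 7.16 can be made concrete with a short calculation. The sketch below assumes the bound $1/2^{m-1}$ from Lemma 7.13 and the condition $2^{-(m-3)} < \varepsilon$ of Equation (7.26); the function name `total_ancillas` and the sample values of `b` and `m_original` are hypothetical.

```python
def total_ancillas(eps, m_original):
    """Smallest m >= m_original with 2**-(m - 3) < eps, as in Equation (7.26)."""
    if eps <= 0:
        raise ValueError("eps must be positive")
    m = m_original
    while 2.0 ** -(m - 3) >= eps:
        m += 1
    return m


# Hypothetical instance: soundness parameter b and a circuit with 2 ancillas.
b = 0.1
m_original = 2
eps = b / 2                         # target error in Theorem 7.15
m = total_ancillas(eps, m_original)

extra_qubits = m - m_original       # qubits added by the padding
per_term = 2 * 2.0 ** -(m - 1)      # Lemma 7.13 applied to both C_1 and C_2
```

For these values `m` is 8, so six qubits are added, and the contribution `per_term` of each term with $i \neq 0$ is below $\varepsilon/2$, which is what the proof of Theorem 7.15 requires.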

A natural question to ask is whether this hardness result can be extended to the case
of log-depth mixed-unitary circuits. With the construction of Section 7.7 this does not
appear to be possible: the procedure that mixes the input qubits whenever some
ancillary qubit is not in the $|0\rangle$ state requires mixing operations
to be applied to all other qubits, controlled by each of the qubits in the space $\mathcal{A}$. This
results in a linear depth circuit, since each input qubit is the target of at least $\log \dim \mathcal{A}$
mixing operations. Another approach might be to apply a construction similar to that
used in Chapter 4 to a mixed-unitary Close Images problem. This approach runs
into an immediate problem: all mixed-unitary circuits are unital, and so they must all
output the completely mixed state when given it as input, i.e. the images of any two
mixed-unitary transformations intersect at the completely mixed state, trivializing the
Close Images problem on mixed-unitary circuits.
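The unitality obstruction is easy to see numerically. The following sketch builds two unrelated random mixed-unitary channels and checks that both fix the completely mixed state, in contrast with a channel that resets its input to $|0\rangle\langle 0|$. The helper names (`random_unitary`, `random_mixed_unitary`, `reset`) and the choice of dimension are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)


def random_unitary(d):
    """A unitary from the QR decomposition of a complex Gaussian matrix."""
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    q, _ = np.linalg.qr(g)
    return q


def random_mixed_unitary(d, k):
    """A channel rho -> sum_i p_i V_i rho V_i^* with random p_i and V_i."""
    probs = rng.dirichlet(np.ones(k))
    unitaries = [random_unitary(d) for _ in range(k)]

    def channel(rho):
        return sum(p * V @ rho @ V.conj().T for p, V in zip(probs, unitaries))

    return channel


def reset(rho):
    """A non-unital channel for contrast: replace the input by |0><0|."""
    out = np.zeros_like(rho)
    out[0, 0] = np.trace(rho)
    return out


d = 4
mixed = np.eye(d, dtype=complex) / d
phi_1 = random_mixed_unitary(d, 3)
phi_2 = random_mixed_unitary(d, 5)

out_1, out_2 = phi_1(mixed), phi_2(mixed)
out_reset = reset(mixed)
```

Both `out_1` and `out_2` equal the completely mixed state, so the images of any two such channels always share a point, while `out_reset` does not.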



7.9 Conclusion 

In this chapter a few problems are shown to be no easier when restricted to the class 
of mixed-unitary channels. This is done using a technique by which a channel is 
approximated using a mixed-unitary channel. While this approximation does not, in 
general, look very much like the original channel, for measures based on the behaviour 
of the channel on low-entropy outputs this approximation can be made arbitrarily 
good by padding the input channel with unused ancillary space. The approximation 
technique overcomes two main hurdles: viewed as a circuit, the original channel may 
trace out qubits and it may introduce fresh ancillary qubits in some pure state. The 
partial trace can be easily simulated by the mixed-unitary channel that maps any state 
to the completely mixed state, but a more complicated construction is required to deal 
with the ancillary space. 

This construction is applied to the maximum output p-norm and the minimum
output entropy, where it is used to show that the multiplicativity or additivity of a
channel is implied by the additivity or multiplicativity of a related set of mixed-unitary
channels. This can be used to show that the general multiplicativity and additivity
problems are equivalent when restricted to the mixed-unitary channels, but this is no
longer an interesting result, since mixed-unitary channels that are not additive [Has09]
and not multiplicative [HW08] have recently been discovered.



When applied to the computational problem of distinguishing two transformations,
this approximation scheme proves that this problem remains QIP-hard when restricted
to mixed-unitary inputs. This is perhaps surprising: mixed-unitary channels have
several nice properties [GW03], but computational distinguishability does not appear
to be one of them. This can be seen as evidence that, despite the nice properties enjoyed
by these channels, they may be sufficiently general to be a useful model of noise in
quantum systems.
Chapter 8



Conclusion 



This thesis has introduced the problem Quantum Circuit Distinguishability, which
is a computational version of channel distinguishability. This problem has been shown
to be hard for the class QIP of problems that have quantum interactive proof systems,
which is equal to the more familiar class PSPACE [JJUW09]. This problem gives a new
quantum characterization of this class that is not closely tied to quantum interactive
proof systems.

The hardness of the distinguishability problem leads naturally to a study of the 
problem on restricted classes of channels. This can be seen as an attempt to isolate 
those instances of the problem that are hard, so that more is known about those 
instances on which the problem is tractable. This thesis has presented four important 
classes of channels for which the problem remains hard: the channels implemented 
by log-depth circuits, the degradable channels, the antidegradable channels, and the 
mixed-unitary channels. These special cases demonstrate that the QCD problem is 
hard on a wide array of classes of channels, i.e. that the hardness of the problem is 
very likely not tied to just a few hard instances. 

These hardness results are shown by reducing the general problem to a version 
of the problem restricted to a special class of channels. These reductions can have 
applications outside of complexity theory, as they are essentially methods for the 
simulation of general channels by channels in a restricted class. These simulation 
techniques can have powerful implications throughout quantum information. One 
example of such an application is the result that the additivity of the Holevo capacity or 
the multiplicativity of the maximum output p-norm can be (approximately) restricted 
to a mixed-unitary channel. It is hoped that the other reductions presented in the 
thesis will find similar applications. 
Several natural questions are left open by this thesis. Some of the more interesting
of these questions are summarized below.



• There are other models of computation that involve unitary computations
applied to mixed initial states (see [ASV06, KL98]). How hard is the
distinguishability problem for computations in these models?

• What is the complexity of distinguishing constant depth quantum circuits that 
do not use the unbounded fan-out gate? This problem is hard on constant depth 
circuits that have access to this gate, but the reduction used in the thesis does not 
produce constant-depth circuits without access to this gate. 

• How hard is QCD on the entanglement-breaking channels? It was argued in
Section 1.3 that this problem is hard for channels that are exponentially close
together, but this problem is not very interesting with such a weak promise. It
is not known how hard the distinguishability problem is on this subset of the
antidegradable channels with a stronger promise.

• The random Pauli channels, also known as the Pauli diagonal channels, can be
expressed as convex combinations of the channels that apply the discrete Weyl
(or generalized Pauli) operators. These channels are an important subclass of
the mixed-unitary channels. Many of the pieces used in the reduction to
mixed-unitary channels can be expressed as channels of this form, but there is one major
problem: the unitary U from a Stinespring representation of a general channel
needs to be converted to such a channel. Such a simulation result would imply
that the QCD problem is also hard on this class.
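For concreteness, the class of random Pauli channels mentioned in the last question can be sketched as follows. The code assumes the standard shift and clock matrices as generators of the discrete Weyl operators, uses a qutrit ($d = 3$) as an arbitrary example, and checks the familiar fact that the uniform mixture of all Weyl operators is the completely depolarizing channel. The function names `weyl_operators` and `random_pauli_channel` are illustrative.

```python
import numpy as np


def weyl_operators(d):
    """The d*d discrete Weyl operators X^a Z^b (assumed convention)."""
    omega = np.exp(2j * np.pi / d)
    shift = np.roll(np.eye(d, dtype=complex), 1, axis=0)  # X|j> = |j+1 mod d>
    clock = np.diag(omega ** np.arange(d))                # Z|j> = omega^j |j>
    return [
        np.linalg.matrix_power(shift, a) @ np.linalg.matrix_power(clock, b)
        for a in range(d)
        for b in range(d)
    ]


def random_pauli_channel(probs, d):
    """The channel rho -> sum_{a,b} p_{a,b} W_{a,b} rho W_{a,b}^*."""
    ops = weyl_operators(d)

    def channel(rho):
        return sum(p * W @ rho @ W.conj().T for p, W in zip(probs, ops))

    return channel


d = 3
psi = np.array([1, 1, 0], dtype=complex) / np.sqrt(2)
rho = np.outer(psi, psi.conj())

uniform = random_pauli_channel(np.full(d * d, 1 / (d * d)), d)
out = uniform(rho)  # completely mixed state on the qutrit
```

Any other probability vector `probs` gives another member of the class; the open problem is whether the unitary from a Stinespring representation can be simulated within it.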